DIAGNOSIS OF HYPERINSULINEMIA AND TYPE II DIABETES AND 
PROTECTION AGAINST SAME BASED ON GENES DIFFERENTIALLY 
EXPRESSED IN MUSCLE CELLS (15.1) 

Cross-Reference to Related Applications 

Anti-Aging Applications. Mice with a disrupted growth 
hormone receptor/binding protein gene enjoy an increased 
lifespan. In U.S. Prov. Appl. 60/485,222, filed July 8, 
2003 (Kopchick8), mouse genes differentially expressed in 
comparisons of gene expression in growth hormone 
receptor/binding protein gene-disrupted mouse livers and 
normal mouse livers were identified, as were corresponding 
human genes and proteins. It was suggested that the human 
molecules, or antagonists thereof, could be used for 
protection against faster-than-normal biological aging, or 
to achieve slower-than-normal biological aging. It was also 
taught that the human molecules may be used as markers 
of biological aging. 

In provisional application Ser. No. 60/474,606, filed 
June 2, 2003 (our docket Kopchick7-USA), our research group 
used a gene chip to study the genetic changes in the liver 
of C57BL/6J mice that occur at frequent intervals of the 
aging process. Differential hybridization techniques were 
used to identify mouse genes that are differentially 
expressed in mice, depending upon their age. The level of 
gene expression of approximately 10,000 mouse genes (from 
the Amersham Codelink UniSet Mouse I Bioarray, product 
code: 300013) in the liver of mice with average ages of 35, 
49, 56, 77, 118, 133, 207, 403, 558 and 725 days was 
determined. In essence, complementary RNA derived from mice 
of different ages was screened for hybridization with 
oligonucleotide probes each specific to a particular mouse 
gene, each gene in turn representative of a particular mouse 
gene cluster (Unigene). Mouse genes which were 
differentially expressed (younger vs. older), as measured by 
different levels of hybridization of the respective cRNA 
samples with the particular probe corresponding to that 
mouse gene, were identified. Related human genes and 
proteins were identified by sequence comparisons to the 
mouse gene or protein. In the international appl. 
Kopchick7A-PCT, filed June 2, 2004, we added some additional 
studies of CIDE-A (see below). 

In a like manner, the effect of aging on the expression 
of genes in mouse skeletal muscle was studied; see 
provisional application Ser. No. 60/566,068, filed April 29, 
2004 (our docket Kopchick14-USA). 

Anti-Diabetes Applications. In U.S. Provisional Appl. 
Ser. No. 60/458,398 (our docket Kelder1-USA), filed March 
31, 2003, members of our research group describe the 
identification of genes differentially expressed in normal 
vs. hyperinsulinemic, hyperinsulinemic vs. type II diabetic, 
or normal vs. type II diabetic mouse liver. Forward- and 
reverse-subtracted cDNA libraries were prepared, clones 
were isolated, and differentially expressed cDNA inserts 
were sequenced and compared with sequences in publicly 
available sequence databases. The corresponding mouse and 
human genes and proteins were identified. 

The purpose of our research group's provisional 
application Ser. No. 60/460,415 (our docket: Kopchick6-USA), 
filed April 7, 2003, was similar, but complementary 
RNA, derived from RNA of mouse liver, was screened against a 
mouse gene chip. See also 60/506,716, filed Sept. 30, 2003 
(Kopchick6.1). 

Gene chip analyses have also been used to identify 
genes differentially expressed in normal vs. 
hyperinsulinemic, hyperinsulinemic vs. type II diabetic, or 
normal vs. type II diabetic mouse pancreas, see U.S. 
Provisional Appl. 60/517,376, filed Nov. 6, 2003 
(Kopchick12), and muscle, see U.S. Provisional Appl. 
60/547,512, filed Feb. 26, 2004 (Kopchick15). 

Other differential hybridization applications. The use 
of differential hybridization to identify genes and proteins 
is also described in our research group's Ser. No. 
PCT/US00/12145 (Kopchick3A-PCT), Ser. No. PCT/US00/12366 
(Kopchick4A-PCT), and Ser. No. 60/400,052 (Kopchick5). 

All of the foregoing applications are hereby 
incorporated by reference in their entirety. 



BACKGROUND OF THE INVENTION 

Field of the Invention 

The invention relates to various nucleic acid molecules 
and proteins, and their use in (1) diagnosing 
hyperinsulinemia and type II diabetes, or conditions 
associated with their development, and (2) protecting 
mammals (including humans) against them. 

Description of the Background Art 


Diabetes 

A deficiency of insulin in the body results in diabetes 
mellitus, which affects about 18 million individuals in the 
United States. It is characterized by a high blood glucose 
(sugar) level and glucose spilling into the urine due to a 
deficiency of insulin. As more glucose concentrates in the 
urine, more water is excreted, resulting in extreme thirst, 
rapid weight loss, drowsiness, fatigue, and possibly 
dehydration. Because the cells of the diabetic cannot use 
glucose for fuel, the body uses stored protein and fat for 
energy, which leads to a buildup of acid (acidosis) in the 
blood. If this condition is prolonged, the person can fall 
into a diabetic coma, characterized by deep labored 
breathing and fruity-odored breath. 

There are two types of diabetes mellitus, Type I and 
Type II. Type II diabetes is the predominant form found in 
the Western world; fewer than 8% of diabetic Americans have 
the Type I disease. 

Type I diabetes. In Type I diabetes, formerly called 
juvenile-onset or insulin-dependent diabetes mellitus, the 
pancreas cannot produce insulin. People with Type I diabetes 
must have daily insulin injections. But they need to avoid 
taking too much insulin because that can lead to insulin 
shock, which begins with a mild hunger. This is quickly 
followed by sweating, shallow breathing, dizziness, 
palpitations, trembling, and mental confusion. As the blood 
sugar falls, the body tries to compensate by breaking down 
fat and protein to make more sugar. Eventually, low blood 
sugar leads to a decrease in the sugar supply to the brain, 
resulting in a loss of consciousness. Eating a sugary food 
can prevent insulin shock until appropriate medical measures 
can be taken. 

Type I diabetics are often characterized by their low 
or absent levels of circulating endogenous insulin, i.e., 
hypoinsulinemia (1). Islet cell antibodies causing damage 
to the pancreas are frequently present at diagnosis. 
Injection of exogenous insulin is required to prevent 
ketosis and sustain life. 

Type II diabetes. Type II diabetes, formerly called 
adult-onset or non-insulin-dependent diabetes mellitus 
(NIDDM), can occur at any age. The pancreas can produce 
insulin, but the cells do not respond to it. 

Type II diabetes is a metabolic disorder that affects 
approximately 17 million Americans. It is estimated that 
another 10 million individuals are "prone" to becoming 
diabetic. These vulnerable individuals can become resistant 
to insulin, a pancreatic hormone that signals glucose (blood 
sugar) uptake by fat and muscle. In order to maintain normal 
glucose levels, the islet cells of the pancreas produce 
more insulin, resulting in a condition called 
hyperinsulinemia. When the pancreas can no longer produce 
enough insulin to compensate for the insulin resistance, and 
thereby maintain normal glucose levels, hyperglycemia 
(elevated blood glucose) results, and type II diabetes is 
diagnosed. 

Early Type II diabetics are often characterized by 
hyperinsulinemia and resistance to insulin. Late Type II 
diabetics may be normoinsulinemic or hypoinsulinemic. Type 
II diabetics are usually not insulin dependent or prone to 
ketosis under normal circumstances. 

Little is known about the disease progression from the 
normoinsulinemic state to the hyperinsulinemic state, and 
from the hyperinsulinemic state to the Type II diabetic 
state. 

As stated above, type II diabetes is a metabolic 
disorder that is characterized by insulin resistance and 
impaired glucose-stimulated insulin secretion (2,3,4). 
However, Type II diabetes and atherosclerotic disease are 
viewed as consequences of having the insulin resistance 
syndrome (IRS) for many years (5). The current theory of 
the pathogenesis of Type II diabetes is often referred to as 
the "insulin resistance/islet cell exhaustion" theory. 
According to this theory, a condition causing insulin 
resistance compels the pancreatic islet cells to 
hypersecrete insulin in order to maintain glucose 
homeostasis. However, after many years of hypersecretion, 
the islet cells eventually fail and the symptoms of clinical 
diabetes are manifested. Therefore, this theory implies 
that, at some point, peripheral hyperinsulinemia will be an 
antecedent of Type II diabetes. Peripheral insulin levels 
can be viewed as the difference between what is produced by 
the β cell and that which is taken up by the liver. 
Therefore, peripheral hyperinsulinemia can be caused by 
increased β cell production, decreased hepatic uptake or 
some combination of both. It is also important to note that 
it is not possible to determine the origin of insulin 
resistance once it is established, since the onset of 
peripheral hyperinsulinemia leads to a condition of global 
insulin resistance. 
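
The balance described in the preceding paragraph can be 
written as a one-line calculation. The following is only a 
minimal sketch: the function name, the arbitrary units and 
the example numbers are assumptions chosen for illustration 
and are not measurements from any study. 

```python
def peripheral_insulin(beta_cell_secretion: float, hepatic_uptake: float) -> float:
    """Insulin reaching the periphery: beta cell output minus hepatic uptake.

    Both rates are in the same (arbitrary, hypothetical) units.
    """
    if beta_cell_secretion < 0 or hepatic_uptake < 0:
        raise ValueError("rates must be non-negative")
    if hepatic_uptake > beta_cell_secretion:
        raise ValueError("hepatic uptake cannot exceed secretion")
    return beta_cell_secretion - hepatic_uptake


# Illustrative values only (arbitrary units).
baseline = peripheral_insulin(10.0, 5.0)        # reference state
more_secretion = peripheral_insulin(14.0, 5.0)  # increased beta cell production
less_uptake = peripheral_insulin(10.0, 1.0)     # decreased hepatic uptake
```

In this sketch the two illustrative perturbations give the 
same peripheral level, which mirrors the point that the 
peripheral level alone does not reveal which mechanism is 
responsible. 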

Multiple environmental and genetic factors are involved 
in the development of insulin resistance, hyperinsulinemia 
and type II diabetes. An important risk factor for the 
development of insulin resistance, hyperinsulinemia and type 
II diabetes is obesity, particularly visceral obesity 
(6,7,8). Type II diabetes exists world-wide, but in 
developed societies, the prevalence has risen as the average 
age of the population increases and the average individual 
becomes more obese. 

Obesity and Diabetes. Obesity is a serious and growing 
problem in the United States. Obesity-related health risks 
include high blood pressure, hardening of the arteries, 
cardiovascular disease, and Type II diabetes (also known as 
non-insulin-dependent diabetes mellitus) (9,10,11). Recent 
studies show that 85% of the individuals with Type II 
diabetes are obese (12). 

Treatment of Diabetes. For many years, treatment was 
insulin therapy for Type I and oral sulfonylureas and/or 
insulin therapy for Type II. Metformin (Glucophage) was the 
first antidiabetic drug approved by FDA (May 1995) for the 
treatment of Type II diabetes since the oral sulfonylureas 
were introduced in 1984. Metformin promotes the use of 
insulin already in the blood. This May 1995 approval was 
followed by the September 1995 approval of another 
antidiabetic drug, acarbose (Precose). It slows down the 
digestion and absorption of complex sugars, which reduces 
blood sugar levels after meals. 

Before 1982, insulin was purified from beef or pork 
pancreas. This was a problem for those diabetics allergic to 
animal insulin. Researchers produced a synthetic insulin 
called Humulin. Approved by FDA in 1982, it was the first 
genetically engineered consumer health product manufactured 
for diabetics. Synthetic insulins can be produced in 
unlimited quantities. 

Another possible treatment for diabetes includes 
surgically replacing the pancreas' endocrine tissues (islets 
of Langerhans) with healthy islet of Langerhans tissue 
grafts. Since 1988, 45 patients worldwide have undergone 
successful transplantation. 

Complications. Complications of diabetes (end organ 
damage) include retinopathy, neuropathy, and nephropathy 
(traditionally designated as microvascular complications) as 
well as atherosclerosis (a macrovascular complication). 
Early stages of hyperglycemia can usually be controlled by 
an alteration in diet and increasing the amount of exercise, 
but drug treatment, including insulin, may be required. It 
has been shown that meticulous blood glucose control can 
often slow down or halt the progression of diabetic 
complications if caught early enough (1). However, tight 
metabolic control is extremely difficult to achieve. 


Animal Models 

Transgenic Mouse Models of Diabetes or Diabetes 
Resistance. McGrane, et al., J. Biol. Chem. 263:11443-51 
(1988) and Chen, et al., J. Biol. Chem., 269:15892-7 (1994) 
describe the genetic engineering of mice to express bovine 
growth hormone (bGH) or human growth hormone (hGH), 
respectively. These mice exhibited an enhanced growth 
phenotype. They also developed kidney lesions similar to 
those seen in diabetic glomerulosclerosis, see Yang, et al., 
Lab. Invest., 68:62-70 (1993). Ogueta, et al., J. 
Endocrinol., 165: 321-8 (2000) reported that transgenic mice 
expressing bovine GH develop an arthritic disorder and 
self-antibodies. 

Growth hormone has many roles, ranging from regulation 
of protein, fat and carbohydrate metabolism to growth 
promotion. GH is produced in the somatotrophic cells of the 
anterior pituitary and exerts its effects either through the 
GH-induced action of IGF-I, in the case of growth promotion, 
or by direct interaction with the GHR on target cells 
including liver, muscle, adipose, and kidney cells. 
Hyposecretion of GH during development leads to dwarfism, 
and hypersecretion before puberty leads to gigantism. In 
adults, hypersecretion of GH results in acromegaly, a 
clinical condition characterized by enlarged facial bones, 
hands and feet, fatigue, and an increase in weight. Of those 
individuals with acromegaly, 25% develop type II diabetes. 
This may be due to insulin resistance caused by the high 
circulating levels of GH leading to high circulating levels 
of insulin (Kopchick et al., Annual Rev. Nutrition 1999. 
19:437-61). 

A further mode of GH action may be through the 
transcriptional regulation of a number of genes contributing 
to the physiological effects of GH. 

Growth hormone genes and the proteins encoded by them 
can be converted into growth hormone antagonists by 
mutation, see Kopchick USP 5,350,836. Transgenic mice have 
been made that express the GH antagonists bGH-G119R or 
hGH-G120R, and which exhibit a dwarf phenotype. Chen, et al., 
J. Biol. Chem., 269:15892-7 (1994); Chen, et al., Mol. 
Endocrinol., 5:1845-52 (1991); Chen, et al., Proc. Nat. Acad. 
Sci. USA 87:5061-5 (1990). These mice did not develop 
kidney lesions. See Yang (1993), supra. 

Chen, et al., Endocrinol., 136:660-7 (1995) compared the 
effect of streptozotocin treatment in normal nontransgenic 
mice, and in mice transgenic for (1) a GH receptor 
antagonist, the G119R mutant of bovine growth hormone or (2) 
the E117L mutant of bGH. (According to Chen's ref. 24, 
these large GH transgenic streptozotocin-treated mice 
constitute an animal model for diabetes.) Glomerulosclerosis 
was seen in diabetic (STZ-treated) nontransgenic mice and in 
diabetic bGH-E117L mice, but not in diabetic bGH-G119R (GH 
antagonist) mice. 

Two of the proteins which mediate growth hormone 
activity are the growth hormone receptor and the growth 
hormone binding protein, encoded by the same gene in 
mice (GHR/BP). It is possible to genetically engineer mice 
so that the gene encoding these proteins is disrupted 
("knocked-out"; inactivated), see Zhou, et al., Proc. Nat. 
Acad. Sci. (USA), 94:13215-20 (1997). Zhou, et al. 
inactivated the GHR/BP gene by replacing the 3' portion of 
exon 4 (which encodes a portion of the GH binding domains) 
and the 5' region of intron 4 with a neomycin gene cassette. 
The modified gene was introduced into the target mice by 
homologous recombination. Like mice expressing a GH 
antagonist, homozygous GHR/BP-KO mice exhibit a dwarf 
phenotype. GHR/BP-KO mice, made diabetic by streptozotocin 
treatment, are protected from the development of 
diabetes-associated nephropathy. Bellush, et al., Endocrinol., 
141:163-8 (2000). 

High-Fat Diets. High-fat diets have been shown to 
induce both obesity and Type II diabetes in laboratory 
animals (13). Surwit and colleagues demonstrated that male 
C57BL/6J mice are extremely sensitive to the diabetogenic 
effects of a high-fat diet when initiated at weaning. At 
six months of age, high-fat fed animals had significantly 
elevated fasting blood-glucose and insulin levels and also 
demonstrated a decrease in insulin sensitivity (14). Ahren 
and colleagues (15) reported evidence of insulin resistance 
as well as diminished glucose-stimulated insulin release, 
after feeding with a high-fat diet for 12 weeks. These mice 
also showed elevated levels of total cholesterol, 
triglycerides, and free fatty acids, another hallmark of 
Type II diabetes. 

Anatomy and Physiology of Muscle 

Muscle tissue constitutes about 40% of the body mass. 
Muscles may be classified by location, i.e., skeletal if 
attached to bone, cardiac if forming the wall of the heart, 
and visceral if associated with another body organ. Muscles 
may also be classified as voluntary or involuntary, 
depending on how their contractions and relaxations are 
controlled. Skeletal muscles are voluntary, while cardiac 
and visceral muscles are involuntary. It is also possible 
to classify muscles morphologically; skeletal and cardiac 
muscle cells are striated, whereas visceral muscle cells are 
not. 

Each skeletal muscle is composed of many individual muscle 
cells called muscle fibers. The fibers are held together by 
fibrous connective-tissue membranes called fascia. The 
fascia which envelops the entire muscle is the epimysium, 
and the fascia which penetrate the muscle, separating the 
fibers into bundles (fasciculi), are called perimysium. Very 
thin fascia (endomysium) sheath each muscle fiber. Skeletal 
muscles are attached either directly to a bone, or 
indirectly through a tendon. 

The individual muscle fibers (cells) comprise threadlike 
protein structures called myofibrils. 

There are over 600 muscles in the human body. We will have 
occasion later to refer to the gastrocnemius. It is a 
superficial muscle in the posterior compartment of the lower 
leg, which together with the underlying soleus forms the 
characteristic bulge of the calf. 

Role of Muscle in Development of Type II Diabetes 

Muscle, fat and liver tissues are the major 
contributors to the development of insulin resistance, 
hyperinsulinemia, and, ultimately, type II diabetes. 
Muscle cells respond to insulin by increasing glucose uptake 
from the bloodstream. Muscle tissue can become resistant to 
insulin, causing the beta cells to initially increase 
insulin secretion. Eventually, though, the beta cells become 
unable to compensate for this increasing insulin resistance 
from muscle and other cells, and they fail to respond to 
elevated blood glucose levels. Thus, clinical type 2 
diabetes results from the combination of insulin resistance 
and impaired beta cell function. 

Defects in muscle glycogen synthesis are known to play 
a role in the development of insulin resistance. At least 
three steps (those mediated by glycogen synthase, hexokinase, 
and GLUT4) have been reported to be defective in patients 
with type 2 diabetes. 

Fatty acids can induce insulin resistance, and it has 
been suggested that this is a consequence of altered 
insulin signaling through PI3-kinase. PKC-theta has also 
been implicated. 

See generally Petersen, et al., "Pathogenesis of 
skeletal muscle insulin resistance in type 2 diabetes 
mellitus", in "A Symposium: Evolution of type 2 diabetes 
mellitus management", at Amer. J. Cardiol., 90 (5A): 11G-18G 
(Sept. 5, 2002). 

Adverse Effects of Type II Diabetes on Muscle 

"Myopathy is a general term used to describe any 
disease of muscles, such as the muscular dystrophies and 
myopathies associated with thyroid disease. It can be caused 
by endocrine disorders, including diabetes, metabolic 
disorders, infection or inflammation of the muscle, certain 
drugs and mutations in genes. In diabetes, myopathy is 
thought to be caused by neuropathy, a complication of 
diabetes. General symptoms of myopathies include muscle 
weakness of limbs sometimes occurring during exercise 
although in some cases the symptoms diminish as exercise 
increases. Depending on the type of myopathy, one muscle 
group may be more affected than others." See "Joint and 
Muscle Problems Associated with Diabetes", 
www.iddtinternational.org/jointandmuscleproblems.html [Last 
modified June 12, 2003]. 

Diabetic muscle infarction can spontaneously affect 
patients with a long history of poorly controlled diabetes. 
"Most affected patients have multiple microvascular 
complications (neuropathy, nephropathy, and retinopathy). 
The clinical presentation is an acute onset of pain and 
swelling over days to weeks in the affected muscle groups 
(usually the thigh or calf), along with varying degrees of 
tenderness.... Therapy consists of rest and analgesia. 
Routine daily activities are not deleterious to the 
condition, but physical therapy may cause exacerbation. 
Spontaneous diabetic muscle infarction tends to resolve over 
a period of weeks to months in most cases." See 
"Musculoskeletal Complications of Diabetes - Part 2", 
www.diabetic-lifestyle.com/articles/jan02_whats_1.htm [last 
modified Feb. 9, 2004]. See also Trujillo-Santos, et al., 
"Diabetes muscle infarction: an underdiagnosed complication 
of long-standing diabetes," Diabetes Care, 26(1):211-5 
(2003). 



Identification of genes involved in hyperinsulinemia and 
type II diabetes, generally 

Our attention recently has focused on the generation of 
muscle mRNA expression profiles and the identification of 
genes involved in the genesis of obesity-induced 
hyperinsulinemia and type II diabetes. To date, no one has 
attempted to study the actual progression from the normal 
condition to that of hyperinsulinemia or from 
hyperinsulinemia to Type II diabetes in an attempt to 
identify genes that are up-regulated or down-regulated in 
muscle as the disease progresses. 

In previous studies aimed at identifying genes involved 
in diabetes-induced glomerulosclerosis, differential display 
and traditional subtractive hybridization techniques were 
used (16-20). While effective for the identification of a 
few genes (e.g. hmunc13, PED/PEA-15, lactate dehydrogenase, 
amiloride sensitive sodium channel, ubiquitin-like protein, 
mdr1, and a-amyloid protein precursor as well as a few 
novel genes), these techniques can be quite labor intensive. 
The PCR-based method of subtractive hybridization requires 
less starting material, and allows the simultaneous 
isolation of all differentially expressed cDNAs into two 
groups (up-regulated and down-regulated). 

However, the PCR-based method of subtractive 
hybridization is also quite labor-intensive. It produced 
large numbers of false positive candidates and ultimately 
resulted in the identification of a relatively limited 
number of differentially expressed genes (see the 
Kelder1-USA application). 

In order to expand the number of genes that can be 
analyzed simultaneously, several groups have begun to 
utilize DNA microarray analysis to measure differences in 
gene expression between normal and diseased states. 
However, these experiments have been limited with regard to 
the number of experimental conditions analyzed. DNA 
microarray analysis has been performed on normal, obese and 
diabetic mice (21). Also, the obesity and diabetes in the 
mouse models examined were caused by a specific endogenous 
genetic mutation (22). The differentially expressed genes 
in the above models may be very different from genes 
differentially expressed due to diet-induced obesity and 
Type II diabetes. 

The use of differential expression and related techniques to 
identify genes useful in the treatment of diabetes has been 
reviewed by Perfetti, et al., Diabetes Technol. & 
Therapeut., 5(3): 421-3 (2003), and by Bernal-Mizrachi, et 
al., Diabetes Metab. Res. Rev. 19: 32-42 (2003). 

Other papers of interest include: 

Wada, et al., "Gene expression profile in 
streptozotocin-induced diabetic mice kidneys undergoing 
glomerulosclerosis", Kidney Int., 59:1363-73 (2001); 

Song, et al., "Cloning of a novel gene in the human 
kidney homologous to rat munc13s: its potential role in 
diabetic nephropathy", Kidney Int., 53:1689-95 (1998); 

Page, et al., "Isolation of diabetes-associated kidney 
genes using differential display", Biochem. Biophys. Res. 
Comm., 232:49-53 (1997); 

Peradi, "Subtractive hybridization cloning: An efficient 
technique to detect overexpressed mRNAs in diabetic 
nephropathy," Kidney Int., 53:926-31 (1998); 

Condorelli, EMBO J., 17:3858-66 (1998). 

Diabetes-Specific Differential Expression in Muscle 

Sreekumar, et al., "Gene expression profile in skeletal 
muscle of type 2 diabetes and the effect of insulin 
treatment," Diabetes 51: 1913 (June 2002) surveyed 6,451 
genes, and identified 85 genes for which there was an 
alteration in skeletal muscle transcription in diabetic 
patients after withdrawal of insulin treatment. Subsequent 
insulin treatment resulted in further changes in 
transcription of 74 of the 85 genes (15 increased, 59 
decreased), and also resulted in alteration of 29 additional 
gene transcripts. 

Mootha, et al., "PGC-1α-responsive genes involved in 
oxidative phosphorylation are coordinatively downregulated 
in human diabetes," Nature Genetics 34(3): 267 (July 2003), 
used DNA microarrays to detect changes in the expression of 
sets of related genes, rather than of individual genes. They 
classified over 22,000 genes into 149 data sets; some of 
these data sets overlapped. They looked for a statistical 
correlation between the overall rank order of the genes in 
differential expression, and the groups to which the genes 
belonged. Expression was compared pairwise among three 
groups: males with normal glucose tolerance; males with 
impaired glucose tolerance; and males with type 2 diabetes. 
The set with the highest enrichment score (the one whose 
members ranked highly most often relative to chance 
expectation) was an internally curated set of 106 genes 
involved in oxidative phosphorylation. While the average 
decrease for the individual genes was modest (~20%), it was 
also consistent, being observed in 89% (94/106) of the genes 
in question. This paper is reviewed by Toye and Gauguier, 
"Genetics and functional genomics of type 2 diabetes 
mellitus", Genome Biology, 4: 241 (2003). 

Patti, et al., "Coordinated reduction of genes of oxidative 
metabolism in humans with insulin resistance and diabetes: 
Potential role of PGC1 and NRF1", Proc. Nat. Acad. Sci. 
(USA), 100(14): 8466 (July 8, 2003) used microarrays to 
analyze skeletal muscle expression of genes in nondiabetic 
insulin-resistant subjects at high risk for diabetes (based 
on family history of diabetes and Mexican-American 
ethnicity) and diabetic Mexican-American subjects. Of 7,129 
sequences represented on the microarray, 187 were 
differentially expressed between control and diabetic 
subjects. However, no single gene remained significantly 
differentially expressed after controlling for multiple 
comparison false discovery by using the Benjamini-Hochberg 
method, see Benjamini, et al., J. R. Stat. Soc. Ser. B 
57:289-300 (1995); Dudoit, et al., Stat. Sin. 12: 111-139 
(2002). Consequently, Patti et al. sought to identify 
groups of related genes with similar patterns of 
differential expression using MAPPFINDER and ONTOEXPRESS. 
According to MAPPFINDER, the top-ranked cellular component 
terms were mitochondrion, mitochondrial membrane, 
mitochondrial inner membrane, and ribosome, and the 
top-ranked process term was ATP biosynthesis. According to 
ONTOEXPRESS, the over-represented groups were energy 
generation, protein biosynthesis/ribosomal proteins, RNA 
binding, ribosomal structural protein, and ATP synthase 
complex. 
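
The Benjamini-Hochberg step-up procedure cited above can be 
sketched as follows. This is a minimal illustration written 
for this discussion, not the code used by Patti et al.; the 
function name, the 0.05 false discovery rate and the example 
p-values are assumptions. The second example shows how 
several nominally significant p-values can all fail the 
correction. 

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, in the input order, that is True
    where the corresponding hypothesis is rejected at the given FDR.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * fdr.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Hypothetical p-values, for illustration only.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
calls = benjamini_hochberg(example_p, fdr=0.05)

# Three p-values below 0.05, yet none passes its rank-based threshold.
none_survive = benjamini_hochberg([0.02, 0.03, 0.04, 0.2], fdr=0.05)
```

With thousands of probes, as on a microarray, the rank-based 
thresholds become very small, which is why a list of 
nominally significant genes can shrink to nothing after 
correction. 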

Huang, Xudong, "Identification of abnormally expressed genes 
in skeletal muscle contributing to insulin resistance and 
type 2 diabetes", Thesis, document id: 9576 Lunds 
University 2002, reported differential expression of the 
mitochondrially-encoded ND1 gene in human diabetic patients 
and of the nuclear-encoded cathepsin L gene in mice. 

Standaert, et al., "Skeletal muscle insulin resistance in 
obesity-associated type 2 diabetes in monkeys is linked to a 
defect in insulin activation of protein kinase 
C-zeta/lambda/iota," Diabetes 51: 2936 (Oct. 2002). The 
authors concluded that defective activation of atypical PKCs 
played an important role in the pathogenesis of peripheral 
insulin resistance in both obese prediabetic and diabetic 
monkeys. They attributed this linkage to the apparent 
requirement for aPKCs during insulin-stimulated glucose 
transport. 

Strommer, et al., "Skeletal muscle insulin resistance after 
trauma: insulin signaling and glucose transport", Am. J. 
Physiol., 275(2 Pt. 1): E351-8 (Aug. 1998) concluded that 
insulin resistance in skeletal muscle after surgical trauma 
is associated with reduced glucose transport but not with 
impaired insulin signaling to PI 3-kinase or its downstream 
target, Akt. 

Aging-Specific Differential Expression in Muscle 

Gene Chip-Based Identification of genes involved in aging of 
skeletal muscle 

Several groups have used DNA microarrays to measure 
differences in gene expression caused by the aging process. 
However, these experiments are extremely limited with regard 
to the number of aging time points or experimental 
conditions. 

Weindruch, et al., "Microarray profiling of gene 
expression in aging and its alteration by caloric 
restriction in mice" in Symposium: Calorie Restriction: 
Effects on Body Composition, Insulin Signaling and Aging 
918S-923S (2001) (21) compared expression in gastrocnemius 
muscle from 5- and 30-month old C57BL/6 mice, with and 
without caloric restriction. In this analysis, the 
expression of 113 genes was found to be changed by at least 
two-fold in 5-month old mice compared to 30-month old mice. 
Caloric restriction of comparable mice caused a reversal of 
the altered gene expression of 33 genes. 

Of the 6347 genes surveyed in the oligonucleotide 
microarray, only 58 (0.9%) displayed a greater than 2-fold 
increase in gene expression as a function of aging, whereas 
55 (0.9%) displayed a greater than 2-fold decrease. 
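
The counts above can be checked with simple arithmetic, and 
the two-fold criterion can be expressed as a small 
classification rule. The sketch below is illustrative only: 
the gene counts are taken from the text, but the function 
names, the strict "greater than" threshold and the example 
expression values are assumptions made here. 

```python
TOTAL_GENES = 6347  # genes surveyed on the microarray (from the text)
UP = 58             # greater than 2-fold increase with aging (from the text)
DOWN = 55           # greater than 2-fold decrease with aging (from the text)


def percent(count, total):
    """Share of `count` in `total`, as a percentage rounded to one decimal."""
    return round(100.0 * count / total, 1)


def classify(young, old, threshold=2.0):
    """Classify a gene by fold change of old relative to young expression.

    Expression values must be positive; the threshold is a fold change.
    """
    if young <= 0 or old <= 0:
        raise ValueError("expression values must be positive")
    ratio = old / young
    if ratio > threshold:
        return "increased"
    if ratio < 1.0 / threshold:
        return "decreased"
    return "unchanged"


up_pct = percent(UP, TOTAL_GENES)
down_pct = percent(DOWN, TOTAL_GENES)
changed = UP + DOWN  # total genes changed by the two-fold criterion

# Hypothetical expression values, for illustration only.
label_a = classify(100.0, 380.0)  # a 3.8-fold increase
label_b = classify(100.0, 40.0)   # a 2.5-fold decrease
label_c = classify(100.0, 150.0)  # a 1.5-fold change, below the threshold
```

The sum of the two counts agrees with the 113 genes reported 
as changed, and both percentages round to 0.9%. 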

Of the genes positively correlated with aging, 16% 
could be assigned to stress responses. The largest 
differential expression between young and aged animals (3.8 
fold) was the mitochondrial sarcomeric creatine kinase. 

Of the genes negatively correlated with aging, 13% were 
involved in energy metabolism. A noteworthy number were 
genes encoding biosynthetic enzymes (cytochrome P450 IIC12, 
squalene synthase, stearoyl-CoA desaturase, EF-1-gamma). 
Another down-regulated gene encoded a CpG binding protein, 
MeCP2. 

Weindruch further reported that age-related changes in 
gene expression profile were "remarkably attenuated" by 
caloric restriction. 

What appears to be the same experiment is discussed in
Lee, et al., "Gene expression profile of aging and its
retardation by caloric restriction," Science, 285: 1390 (Aug.
27, 1999). This paper lists the individual genes which
were differentially expressed by more than 2-fold, and
classifies them as energy metabolism, neuronal factors,
protein metabolism, stress response, biosynthesis, calcium
metabolism or DNA repair genes.

Welle, et al., "Skeletal muscle gene expression profiles
in 20-29 year old and 65-71 year old women," Exper.
Gerontol., 39: 369-77 (2004) and available electronically as
doi:10.1016/j.exger.2003.11.011 studied gene expression and
physical condition in seven young and eight older women.
With respect to physical condition, the measured or
calculated parameters were total body mass, lean body mass,
left leg lean mass (by biopsy), maximum isometric left knee
extension force, left knee extension force/left leg lean
mass, peak VO2/lean body mass, and peak VO2/left leg lean
mass.

There were 1178 "probe sets" (representing 1053
different Unigene clusters) for which differential
expression was detected; 550 for which expression was higher
in older women, and 628 showing the inverse effect. The
differences ranged from 1.2- to 4-fold; most (78%) were less
than 1.5-fold. The complete list of differentially expressed
genes is given in the Rochester Muscle database website,
www.urmc.rochester.edu/smd/crc/swindex (".html" omitted, in
accordance with USPTO requirements, so that the publication
of this application will not create an active hyperlink).

The gene most highly overexpressed in older muscle was
p21 (cyclin-dependent kinase inhibitor 1A) (4.01-fold). This
is one of several genes (see Welle Table 2) which are
potentially related to DNA damage and repair. Welle also
thought it noteworthy how many of the differentially
expressed genes were ones that encode proteins which bind to
pre-mRNAs or mRNAs (see Welle Table 3).

Other Differential/Subtractive Hybridization Studies of
Interest

Zhang, et al., Kidney International, 56:549-558 (1999)
identified genes up-regulated in 5/6 nephrectomized
(subtotal renal ablation) mouse kidney by a PCR-based
subtraction method. Ten known and nine novel genes were
identified. The ultimate goal was to identify genes
involved in glomerular hyperfiltration and
hypertrophy.

Melia, et al., Endocrinol., 139:688-95 (1998)
applied subtractive hybridization methods for the
identification of androgen-regulated genes in mouse kidney.
The treatment mice were dosed with dihydrotestosterone
(DHT), an androgen. Kidney androgen-regulated protein gene
was used as a positive control, as it is known to be
up-regulated by DHT.

See also Holland, et al., Abstract 607, "Identification
of Genes Possibly Involved in Nephropathy of Bovine Growth
Hormone Transgenic Mice" (Endocrine Society Meeting, June 22,
2000) and Coschigano, et al., Abstract 333, "Identification
of Genes Potentially Involved in Kidney Protection During
Diabetes" (Endocrine Society Meeting, June 22, 2000).

The following differential hybridization articles may
also be of interest: Wada, et al., "Gene expression profile
in streptozotocin-induced diabetic mice kidneys undergoing
glomerulosclerosis", Kidney Int., 59:1363-73 (2001); Song, et
al., "Cloning of a novel gene in the human kidney homologous
to rat munc13s: its potential role in diabetic nephropathy",
Kidney Int., 53:1689-95 (1998); Page, et al., "Isolation
of diabetes-associated kidney genes using differential
display", Biochem. Biophys. Res. Comm., 232:49-53 (1997);
Peradi, "Subtractive hybridization cloning: An efficient
technique to detect overexpressed mRNAs in diabetic
nephropathy," Kidney Int., 53:926-31 (1998); Condorelli,
EMBO J., 17:3858-66 (1998).



Apoptosis and CIDE-A 

Apoptosis is a form of programmed cell death that
occurs in an active and controlled manner to eliminate
unwanted cells. Apoptotic cells undergo an orchestrated
cascade of morphological changes such as membrane blebbing,
nuclear shrinkage, chromatin condensation, and formation of
apoptotic bodies which then undergo phagocytosis by
neighboring cells. One of the hallmarks of cellular
apoptosis is the cleavage of chromosomal DNA into discrete
oligonucleosomal size fragments. This orderly removal of
unwanted cells minimizes the release of cellular components
that may affect neighboring tissue. In contrast, membrane
rupture and release of cellular components during necrosis
often lead to tissue inflammation.

The process of apoptosis is highly conserved and
involves the activation of the caspase cascade. Cohen, G.M.
(1997) Caspases: the executioners of apoptosis. Biochem.
J. 326:1-16; Budihardjo, I., Oliver, H., Lutter, M., Luo,
X., Wang, X. (1999) Biochemical pathways of caspase
activation during apoptosis. Annu. Rev. Cell. Dev.
Biol. 15:269-290; Jacobson, M.D., Weil, M., Raff, M.C.
(1997) Programmed cell death in animal development. Cell
88:347-354. Caspases are a family of cysteine proteases that
are synthesized as inactive proenzymes. Their activation by
apoptotic signals such as CD95 (Fas) death receptor
activation or tumor necrosis factor results in the cleavage
of specific target proteins and execution of the apoptotic
program. Apoptosis may occur by either an extrinsic pathway
involving the activation of cell surface death receptors
(DR) or by an intrinsic mitochondrial pathway. Yoon, J-H.,
Gores, G.J. (2002) Death receptor-mediated apoptosis and
the liver. J. Hepatology 37:400-410.

These pathways are not mutually exclusive and some
cell types require the activation of both pathways for
maximal apoptotic signaling. In type-I cells, death
receptor activation leads to the recruitment and activation
of caspases-8/10 and the rapid cleavage and activation of
caspase-3 in a mitochondrial-independent manner.
Hepatocytes are Type-II cells, in which
mitochondria are essential for DR-mediated apoptosis.
Scaffidi, C., Fulda, S., Srinivasan, A., Friesen, C., Li,
F., Tomaselli, K.J., Debatin, K.M., Krammer, P.H., Peter,
M.E. (1998) Two CD95 (APO-1/Fas) signaling pathways. EMBO
J. 17:1675-1687. In this pathway, the pro-apoptotic protein
Bid is truncated by activated caspases-8/10 and translocates
to the mitochondria. Luo, X., Budihardjo, I., Zou, H.,
Slaughter, C., Wang, X. (1998) Bid, a Bcl2 interacting
protein, mediates cytochrome c release from mitochondria in
response to activation of cell surface death receptors.
Cell 94:481-490; Li, H., Zhu, H., Xu, C.J., Yuan, J.
(1998) Cleavage of BID by caspase 8 mediates the
mitochondrial damage in the Fas pathway of apoptosis. Cell
94:491-501. This translocation leads to mitochondrial
cytochrome c release and eventual activation of caspases-3
and 7 via cleavage by activated caspase-9.

One of the substrates for activated caspase-3 is
the DNA fragmentation factor (DFF). DFF is composed of a 45
kDa regulatory subunit (DFF45) and a 40 kDa catalytic
subunit (DFF40). Liu, X., Zou, H., Slaughter, C., Wang,
X. (1997) DFF, a heterodimeric protein that functions
downstream of caspase-3 to trigger DNA fragmentation during
apoptosis. Cell 89:175-184. DFF45 cleavage by activated
caspase-3 results in its dissociation from DFF40 and allows
the caspase-activated DNAse (CAD) activity of DFF40 to
cleave chromosomal DNA into oligonucleosomal size fragments.
Liu, X., Li, P., Widlak, P., Zou, H., Luo, X., Garrard,
W.T., Wang, X. (1998) The 40-kDa subunit of DNA
fragmentation factor induces DNA fragmentation and chromatin
condensation during apoptosis. Proc. Natl. Acad. Sci. USA.
95:8461-8466; Halenbeck, R., MacDonald, H., Roulston, A.,
Chen, T.T., Conroy, L., Williams, L.T. (1998) CPAN, a human
nuclease regulated by the caspase-sensitive inhibitor
DFF45. Curr. Biol. 8:537-540; Nagata, S. (2000) Apoptotic
DNA fragmentation. Exp. Cell Res. 256:12-8.

Recently, a novel family of cell-death-inducing
DFF45-like effectors (CIDEs) has been identified that
includes CIDE-A, CIDE-B and CIDE-3/FSP27. Inohara, N.,
Koseki, T., Chen, S., Wu, X., Nunez, G. (1998) CIDE, a
novel family of cell death activators with homology to the
45 kDa subunit of the DNA fragmentation factor. EMBO J.
17:2526-2533; Danesch, U., Hoeck, W., Ringold, G.M. (1992)
Cloning and transcriptional regulation of a novel adipocyte-
specific gene, FSP27. CAAT-enhancer-binding protein (C/EBP)
and C/EBP-like proteins interact with sequences required for
differentiation-dependent expression. J. Biol. Chem.
267:7185-7193; Liang, L., Zhao, M., Xu, Z., Yokoyama, K.K.,
Li, T. (2003) Molecular cloning and characterization of
CIDE-3, a novel member of the cell-death-inducing DNA-
fragmentation-factor (DFF45)-like effector family. Biochem.
J. 370:195-203.

The CIDEs contain an N-terminal domain that shares
homology with the N-terminal region of DFF45 and may
represent a regulatory region via protein interaction. See
Inohara, supra; Lugovskoy, A.A., Zhou, P., Chou, J.J.,
McCarty, J.S., Li, P., Wagner, G. (1999) Solution
structure of the CIDE-N domain of CIDE-B and a model for
CIDE-N/CIDE-N interactions in the DNA fragmentation pathway
of apoptosis. Cell 99:747-755. The family members also
share a C-terminal domain that is necessary and sufficient
for inducing cell death and DNA fragmentation; see Inohara
supra. The overexpression of CIDE-A induces cell death that
can be inhibited by DFF45. However, CIDE-A-induced
apoptosis is not inhibited by caspase-8 inhibitors, thereby
suggesting the presence of additional, caspase-independent
pathway(s) for the induction of apoptosis; see Inohara
supra. Previous reports have indicated that human and mouse
CIDE-A are expressed in several tissues such as brown
adipose tissue (BAT) and heart and are localized to the
mitochondria. Zhou, Z., Yon Toh, S., Chen, Z., Guo, K., Ng,
C.P., Ponniah, S., Lin, S.C., Hong, W., Li, P. (2003)
Cidea-deficient mice have lean phenotype and are resistant
to obesity. Nat. Genet. 35:49-56. In addition to the
ability to induce apoptosis, CIDE-A can interact with and
inhibit UCP1 in BAT and may therefore play a role in
regulating energy balance; see Zhou supra.

Previous reports have indicated that CIDE-A is not
expressed in either adult human or mouse liver tissue; see
Inohara supra, Zhou supra.



The human protein cell death activator CIDE-A is of
particular interest because of its highly dramatic change in
liver expression with age, first demonstrated in our
Kopchick7 application, supra. CIDE-A expression is elevated
in older normal mice. CIDE-A expression was studied for
normal C57BL/6J mouse ages 35, 49, 77, 133, 207, 403 and 558
days. Expression is low at the first five data points,
then rises sharply at 403 days, and again at 558 days.

CIDE-A was therefore classified as an "unfavorable 
protein", i.e., it was taught that an antagonist to CIDE-A 
could retard biological aging. 

In Kopchick7A-PCT we reported that CIDE-A is also
prematurely expressed in hyperinsulinemic and type-II
diabetic mouse liver tissue. CIDE-A expression also
correlates with liver steatosis in diet-induced obesity,
hyperinsulinemia and type-II diabetes. These observations
suggest an additional pathway of apoptotic cell death in
Non-Alcoholic Fatty Liver Disease (NAFLD) and that CIDE-A
may play a role in this serious disease and potentially in
liver dysfunction associated with type-II diabetes.



SUMMARY OF THE INVENTION

Differential hybridization techniques have been used to
identify mouse genes that are differentially expressed in
the muscle (gastrocnemius) of mice, depending upon their
development of hyperinsulinemia or type II diabetes.

In essence, complementary RNA derived from normal mice,
or mouse models of hyperinsulinemia or type II diabetes, was
screened for hybridization with oligonucleotide probes each
specific to a particular mouse database DNA, the latter
being identified, by database accession number, by the gene
chip manufacturer. Each database DNA in turn was also
identified by the gene chip manufacturer as representative
of a particular mouse gene cluster (Unigene).

In most cases, this database DNA sequence is a
full-length genomic DNA or cDNA sequence, and is therefore
either identical to, or otherwise encodes the same protein
as does, a natural full-length genomic DNA protein coding
sequence. Those which are not full-length present at least
a partial sequence of a natural gene or its cDNA equivalent.

For the sake of simplicity, all of these mouse database
DNA sequences, whether full-length or partial, and whether
cDNA or genomic DNA, are referred to herein as "mouse genes".
When only the genomic sequence is intended, we will refer
specifically to "genomic DNA" or "gDNA".

The sequences in the protein databases are determined
either by directly sequencing the protein or, more commonly,
by sequencing a DNA, and then determining the translated
amino acid sequence in accordance with the Genetic Code. All
of the mouse sequences in the mouse polypeptide database are
referred to herein as "mouse proteins" regardless of whether
they are in fact full length sequences.

Mouse genes which were differentially expressed (normal
vs. hyperinsulinemic, hyperinsulinemic vs. diabetic, or
normal vs. diabetic), as measured by different levels of
hybridization of the respective cRNA samples with the
particular probe corresponding to that mouse gene, were
identified.

Since the progression is from normal to
hyperinsulinemic, and thence from hyperinsulinemic to type
II diabetic, one may define mammalian subjects as being more
favored or less favored, with normal subjects being more
favored than hyperinsulinemic subjects, and hyperinsulinemic
subjects being more favored than type II diabetic subjects.
The subjects' state may then be correlated with their gene
expression activity.

The terms "normal" and "control" are used
interchangeably in this specification, unless expressly
stated otherwise. The control or normal subject is a mouse
which is normal vis-a-vis fasting insulin and fasting
glucose levels. The term "normal", as used herein, means
normal relative to those parameters, and does not
necessitate that the mouse be normal in every respect.

A mouse gene is said to have exhibited a favorable
behavior if, for a particular mouse age of observation, its
average level of expression in mice which are in a more
favored state is higher than that in mice which are in a
less favored state. A mouse gene is said to have exhibited
an unfavorable behavior if, for a particular mouse age of
observation, its average level of expression in mice which
are in a more favored state is lower than that in mice which
are in a less favored state.

When we observe the mice at several different ages, it
is possible for their expression behavior to vary from time
point to time point. For a given comparison of subjects,
e.g., normal vs. hyperinsulinemic, we classify the mouse
gene as favorable or unfavorable on the basis of the
direction of the largest expression change, and it is the
magnitude of this largest expression change, expressed as a
ratio of greater to lesser, which is set forth in the Master
Table 1 data for that mouse gene. Thus, if at 2 weeks, there
was a 3-fold favorable behavior, and at 8 weeks, there was a
4-fold unfavorable behavior, and at all other observed time
points, the behavior was weaker than 3-fold, the mouse gene
would be classified as an unfavorable gene with respect to
the subject comparison in question.

It will be appreciated that it may be that if the mouse
gene were observed at an age other than one of the ages
noted in the Examples, we would have observed a still
stronger differential expression behavior. Nonetheless, we
must classify the mouse genes on the basis of the behavior
which we actually observed, not the behavior which might
have been observed at some other age.
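
The classification rule just described can be illustrated with a
minimal Python sketch. It is offered only as an illustration under
stated assumptions: the names Observation, fold_ratio, direction and
classify are hypothetical, the expression levels are invented to mirror
the 3-fold/4-fold example above, and nothing here is taken from the
actual analysis software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    age_weeks: int        # age at which the mice were sampled
    more_favored: float   # mean expression in the more favored state
    less_favored: float   # mean expression in the less favored state


def fold_ratio(obs: Observation) -> float:
    """Ratio of the greater to the lesser expression level (always >= 1)."""
    hi = max(obs.more_favored, obs.less_favored)
    lo = min(obs.more_favored, obs.less_favored)
    if lo <= 0:
        raise ValueError("expression levels must be positive")
    return hi / lo


def direction(obs: Observation) -> str:
    """'favorable' when expression is higher in the more favored state."""
    if obs.more_favored > obs.less_favored:
        return "favorable"
    if obs.more_favored < obs.less_favored:
        return "unfavorable"
    return "none"


def classify(observations: list[Observation]) -> tuple[str, float]:
    """Classify by the direction of the largest observed expression change."""
    largest = max(observations, key=fold_ratio)
    return direction(largest), fold_ratio(largest)


# Hypothetical levels: 3-fold favorable at 2 weeks, weak at 4 weeks,
# 4-fold unfavorable at 8 weeks.
example = [
    Observation(age_weeks=2, more_favored=300.0, less_favored=100.0),
    Observation(age_weeks=4, more_favored=110.0, less_favored=100.0),
    Observation(age_weeks=8, more_favored=100.0, less_favored=400.0),
]
print(classify(example))
```

With these invented levels the sketch reports an unfavorable gene with
a ratio of 4.0, in line with the worked example in the text. Only the
ages actually observed enter the calculation.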

We are particularly interested in mouse genes which
exhibit strongly favorable or unfavorable differential
expression behaviors. A behavior is considered strong if
the ratio of the higher level to the lower level is at least
two-fold.

However, a mouse gene may still be identified as
favorable or unfavorable even if none of its observed
behaviors are strong as defined above. In general, we
consider the consistency of its behaviors (that is, are all
or most of the differential expression behaviors at
different ages in the same direction, e.g., hyperinsulinemic
higher than control), the magnitude of the behaviors (higher
the better), and the expression behavior of structurally or
functionally related mouse genes (a mouse gene is more
likely to be identified as favorable on the basis of a
weakly favorable behavior if it is related to other mouse
genes which exhibited favorable, especially strongly
favorable, behavior). If we considered a mouse gene with
only weak differential expression behavior to be worthy of
consideration on the basis of these criteria, then we listed
it in Master Table 1 in the appropriate subtable.

Preferably, the differential behavior observed is both
strong and consistent. Preferably, if related mouse genes
were tested, they exhibit the same direction of differential
expression behavior.

A mouse gene which was more strongly expressed in
hyperinsulinemic tissue than in either normal or type II
diabetic tissue (i.e., C<HI, HI>D) will be deemed both
"unfavorable", by virtue of the control:hyperinsulinemic
comparison, and "favorable", by virtue of the
hyperinsulinemic:diabetic comparison. This is one of several
possible "mixed" expression patterns.

Thus, we can subdivide the "favorables" into wholly and
partially favorables. Likewise, we can subdivide the
unfavorables into wholly and partially unfavorables. The
genes/proteins with "mixed" expression patterns are, by
definition, both partially favorable and partially
unfavorable. In general, use of the wholly favorable or
wholly unfavorable genes/proteins is preferred to use of the
partially favorable or partially unfavorable ones.

It is evident from the foregoing that mixed
genes/proteins are those exhibiting a combination of
favorable and unfavorable behavior. A mixed gene/protein
can be used as would a favorable gene/protein if its
favorable behavior outweighs the unfavorable. It can be
used as would an unfavorable gene/protein if its unfavorable
behavior outweighs the favorable. Preferably, they are used
in conjunction with other agents that affect their balance
of favorable and unfavorable behavior. Use of mixed
genes/proteins is, in general, less desirable than use of
purely favorable or purely unfavorable genes/proteins, but
it is not excluded.

It should be noted that a mouse gene is classified on
the basis of the strongest C-HI behavior among the ages
tested, the strongest HI-D behavior among the ages tested,
and the strongest C-D behavior among the ages tested. If at
least one of these three behaviors is significantly
favorable, and none of the others of these three behaviors
is significantly unfavorable, the mouse gene will be
classified as wholly favorable and listed in subtable 1A of
Master Table 1. However, that does not mean that it may not
have exhibited a weaker but unfavorable expression behavior
at some tested age.
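
As a hedged illustration of how the three strongest behaviors (C-HI,
HI-D and C-D) might be combined into a single label, consider the
following Python sketch. The function name classify_gene and the
dictionary layout are hypothetical. The text does not quantify
"significantly", so the sketch assumes, purely for illustration, the
two-fold ratio used above to define a strong behavior. Genes listed on
the basis of weaker but consistent behavior are a matter of judgment
and are not modeled.

```python
WHOLLY_FAVORABLE = "wholly favorable (subtable 1A)"
WHOLLY_UNFAVORABLE = "wholly unfavorable (subtable 1B)"
MIXED = "mixed (subtable 1C)"
UNCLASSIFIED = "unclassified"


def classify_gene(strongest: dict[str, tuple[str, float]],
                  threshold: float = 2.0) -> str:
    """Combine the strongest behavior of each comparison into one label.

    `strongest` maps a comparison name ("C-HI", "HI-D", "C-D") to a
    (direction, ratio) pair, where direction is "favorable" or
    "unfavorable" and ratio is greater/lesser expression.
    `threshold` is an ASSUMED cutoff for "significant".
    """
    behaviors = list(strongest.values())
    sig_fav = any(d == "favorable" and r >= threshold for d, r in behaviors)
    sig_unfav = any(d == "unfavorable" and r >= threshold for d, r in behaviors)
    if sig_fav and sig_unfav:
        return MIXED
    if sig_fav:
        return WHOLLY_FAVORABLE
    if sig_unfav:
        return WHOLLY_UNFAVORABLE
    return UNCLASSIFIED


# Hypothetical gene expressed most strongly in hyperinsulinemic tissue
# (C<HI, HI>D): unfavorable for C-HI, favorable for HI-D.
peak_in_hi = {
    "C-HI": ("unfavorable", 2.5),
    "HI-D": ("favorable", 3.0),
    "C-D": ("favorable", 1.2),
}
print(classify_gene(peak_in_hi))
```

The invented gene above is labeled mixed, matching the C<HI, HI>D
pattern discussed earlier. A gene whose only significant behaviors are
favorable would be labeled wholly favorable even if a weaker
unfavorable behavior was seen at some age, as the text notes.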

The "favorable", "unfavorable" and "mixed" mouse
proteins of the present invention include the mouse database
proteins listed in the Master Table in the same row as a
particular "favorable", "unfavorable" or "mixed" mouse gene,
respectively. These proteins may be the exact translation
product of the identified mouse gene (database DNA).
However, if they were sequenced directly, they could be
shorter or longer than that translation product. They could
also differ in sequence from the exact translation product
as a result of post-translational modifications.

The mouse proteins of interest also include mouse
proteins which, while not listed in the table, correspond to
(i.e., are homologous to, i.e., could be aligned in a
statistically significant manner to) such mouse proteins or
genes, and mouse proteins which are at least substantially
identical or conservatively identical to the listed mouse
proteins.

Related human genes (database DNAs) and proteins were
identified by searching a database comprising human DNAs or
proteins for sequences corresponding to (i.e., homologous
to, i.e., which could be aligned in a statistically
significant manner to) the mouse gene or protein. More than
one human protein may be identified as corresponding to a
particular mouse chip probe and to a particular mouse gene.

Note that the terms "human genes" and "human proteins"
are used in a manner analogous to that already discussed in
the case of "mouse genes" and "mouse proteins".

As used herein, the term "corresponding" does not mean
identical, but rather implies the existence of a
statistically significant sequence similarity, such as one
sufficient to qualify the human protein or gene as a
homologous protein or DNA as defined below. The greater the
degree of relationship as thus defined (i.e., by the
statistical significance of each alignment used to connect
the mouse cDNA to the human protein or gene, measured by an
E value), the closer the correspondence. The connection
may be direct (mouse gene to human protein) or indirect
(e.g., mouse gene to human gene, human gene to human
protein). By "mouse gene", we mean the mouse gene from which
the gene chip DNA in question was derived.

In general, the human genes/proteins which most closely
correspond, directly or indirectly, to the mouse genes are
preferred, such as the one(s) with the highest, top two
highest, top three highest, top four highest, top five
highest, and top ten highest E values for the final
alignment in the connection process. The human
genes/proteins deemed to correspond to our mouse genes are
identified in the Master Tables.

Note that it is possible to identify homologous full- 
length human genes and proteins, if they are present in the 
database, even if the query mouse DNA or protein sequence is 
not a full-length sequence. 

If there is no homologous full-length human gene or
protein in the database, but there is a partial one, the
latter may nonetheless be useful. For example, a partial
protein may still have biological activity, and a molecule
which binds the partial protein may also bind the
full-length protein so as to antagonize a biological
activity of the full-length protein. Likewise, a partial
human gene may encode a partial protein which has biological
activity, or the gene may be useful in the design of a
hybridization probe or in the design of a therapeutic
antisense DNA.

The partial genes and protein sequences may of course
also be used in the design of probes intended to identify
the full length gene or protein sequence.

For the sake of convenience, we refer to a human
protein as favorable if (1) it is listed in Master Table 1
as corresponding to a favorable mouse gene, or (2) it is at
least substantially identical or conservatively identical to
a listed protein per (1), or (3) it is a member of a human
protein class listed in Master Table 2 (if provided) as
corresponding to a favorable mouse gene. We define a human
protein as unfavorable in an analogous manner. We may
further identify a human protein as being wholly favorable
(see mouse genes of subtable 1A), wholly unfavorable (see
mouse genes of subtable 1B), or mixed, i.e., both partially
favorable and partially unfavorable (see mouse genes of
subtable 1C).

Likewise, a human gene which encodes a particular human
protein may be classified in the same way as the human
protein which it encodes.

However, it should be noted that this classification is
not based on the direct study of the expression of the human
gene/protein. Of course, the human genes/proteins of
ultimate interest will be the ones whose change in level of
expression is, in fact, correlated, directly or inversely,
with the change of state (normal, hyperinsulinemic,
diabetic) of the subject.

After identifying related human genes and proteins, one
may formulate agents useful in screening humans at risk for
progression toward hyperinsulinemia or toward type II
diabetes, or protecting humans at risk thereof from
progression from a normoinsulinemic state to a
hyperinsulinemic state, or from either to a type II diabetic
state.

Agents which bind the "favorable" and "unfavorable"
nucleic acids (e.g., the agent is a substantially
complementary nucleic acid hybridization probe), or the
corresponding proteins (e.g., an antibody vs. the protein)
may be used to evaluate whether a human subject is at
increased or decreased risk for progression toward type II
diabetes. A subject with one or more elevated "unfavorable"
and/or one or more depressed "favorable" genes/proteins is
at increased risk, and one with one or more elevated
"favorable" and/or one or more depressed "unfavorable"
genes/proteins is at decreased risk. One may further take
into account whether the subject is normoinsulinemic or
hyperinsulinemic at the time of the assay. If the subject
is non-diabetic and normoinsulinemic, we are especially
interested in the "favorable" and "unfavorable" human
genes/proteins corresponding to mouse genes differentially
expressed in hyperinsulinemic vs. normal muscle. If the
subject is already hyperinsulinemic, yet non-diabetic, we
are especially interested in the "favorable" and
"unfavorable" human genes/proteins corresponding to mouse
genes differentially expressed in type II diabetic vs.
hyperinsulinemic muscle.

The assay may be used as a preliminary screening assay
to select subjects for further analysis, or as a formal
diagnostic assay.
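
The risk logic described above might be tallied as in the following
Python sketch, which is illustrative only. The function name
risk_evidence, the marker names, and the cutoffs used to call a marker
"elevated" or "depressed" relative to a reference level are all
assumptions; the text does not specify cutoffs, nor does it say how
conflicting markers should be weighed, so the sketch merely lists the
evidence on each side.

```python
def risk_evidence(markers: dict[str, tuple[str, float]],
                  elevated: float = 1.5,
                  depressed: float = 1 / 1.5) -> tuple[list[str], list[str]]:
    """Sort markers into evidence for increased or decreased risk.

    `markers` maps a marker name to (kind, ratio), where kind is
    "favorable" or "unfavorable" and ratio is the subject's level
    divided by a reference level. The cutoffs are ASSUMED values.
    """
    increased: list[str] = []
    decreased: list[str] = []
    for name, (kind, ratio) in markers.items():
        if kind not in ("favorable", "unfavorable"):
            raise ValueError(f"unknown marker kind: {kind!r}")
        if ratio >= elevated:
            # Elevated unfavorable -> increased risk; elevated favorable -> decreased.
            (increased if kind == "unfavorable" else decreased).append(name)
        elif ratio <= depressed:
            # Depressed favorable -> increased risk; depressed unfavorable -> decreased.
            (increased if kind == "favorable" else decreased).append(name)
    return increased, decreased


# Hypothetical panel for one subject.
panel = {
    "marker_u1": ("unfavorable", 2.0),   # elevated unfavorable
    "marker_f1": ("favorable", 0.5),     # depressed favorable
    "marker_f2": ("favorable", 1.0),     # unchanged
}
up, down = risk_evidence(panel)
print("increased-risk evidence:", up)
print("decreased-risk evidence:", down)
```

In practice the panel would be chosen according to the subject's
current state, as described above: markers from the hyperinsulinemic
vs. normal comparison for a normoinsulinemic subject, and markers from
the diabetic vs. hyperinsulinemic comparison for a subject who is
already hyperinsulinemic.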



The identification of the related genes and proteins
may also be useful in protecting humans against these
disorders.

Thus, Applicants contemplate: 

(1) use of the "favorable" mouse DNAs (or fragments
thereof) of the Master Tables (below) to isolate or identify
related human DNAs;

(2) use of human DNAs, related to favorable mouse DNAs,
to express the corresponding human proteins;

(3) use of the corresponding human proteins (and mouse
proteins, if biologically active in humans), to protect
against the disorder(s);

(4) use of the corresponding mouse or human proteins,
or nucleic acid probes derived from the mouse or human
genes, in diagnostic agents, in assays to measure
progression toward hyperinsulinemia or type II diabetes, or
protection against the disorder(s), or to estimate related
end organ damage such as kidney damage; and

(5) use of the corresponding human or mouse genes
therapeutically in gene therapy, to protect against the
disorder(s).

Moreover, Applicants contemplate:

(1) use of the "unfavorable" mouse DNAs (or fragments
thereof) of the Master Tables to isolate or identify related
human DNAs;

(2) use of the complement to the "unfavorable" mouse
DNAs or related human DNAs, as antisense molecules to
inhibit expression of the related human DNAs;

(3) use of the mouse or human DNAs to express the
corresponding mouse or human proteins;

(4) use of the corresponding mouse or human proteins,
in diagnostic agents, to measure progression toward
hyperinsulinemia or type II diabetes, or protection against
the disorder(s), or to estimate related end organ damage
such as kidney damage;

(5) use of the corresponding mouse or human proteins in
assays to determine whether a substance binds to (and hence
may neutralize) the protein; and

(6) use of the neutralizing substance to protect
against the disorder(s).

Thus, DNAs of interest include those which specifically
hybridize to the aforementioned mouse or human genes, and
are thus of interest as hybridization assay reagents or for
antisense therapy. They also include synthetic DNA sequences
which encode the same polypeptide as is encoded by the
database DNA, and thus are useful for producing the
polypeptide in cell culture or in situ (i.e., gene therapy).
Moreover, they include DNA sequences which encode
polypeptides which are substantially structurally identical
or conservatively identical in amino acid sequence to the
mouse and human proteins identified in Master Table 1,
subtables 1A or 1C. Finally, they include DNA sequences
which encode peptide (including antibody) antagonists of the
proteins of Master Table 1, subtables 1B or 1C.

The related human DNAs may be identified by comparing
the mouse sequence (or its AA translation product) to known
human DNAs (and their AA translation products).

Related human DNAs also may be identified by screening
human cDNA or genomic DNA libraries using the mouse gene of
the Master Table, or a fragment thereof, as a probe.
If the mouse gene of Master Table 1 is not full-length, and
there is no closely corresponding full-length mouse gene in
the sequence databank, then the mouse DNA may first be used
as a hybridization probe to screen a mouse cDNA library to
isolate the corresponding full-length sequence.
Alternatively, the mouse DNA may be used as a probe to
screen a mouse genomic DNA library.

Our animal models of hyperinsulinemia and diabetes are
also obese. It is possible that the genes found to be
favorable act indirectly by inhibiting obesity. Likewise, it
is possible that the genes found to be unfavorable act
indirectly by accentuating obesity. Consequently, it is
within the compass of the present invention to use the
favorable genes and proteins, or to use antagonists of the
unfavorable genes and proteins, to protect against obesity,
as well as against sequelae of obesity such as
hyperinsulinemia and diabetes.

Since type II diabetes is an age-related disease, the 
agents of the present invention may be used in conunction 
with known anti -aging or anti-age-related disease agents. 
It is of particular interest to use the agents of the 
present invention in conjunction with an agent disclosed in 
one of the related applications cited above, in particular, 
an antagonist to CIDE-A, the latter having been taught in 
Kopchick7 and Kopchick7A-PCT.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1. Body weight gain [Fig. 1a], fasting glucose [Fig.
1b] and fasting insulin [Fig. 1c] levels of mice on the HF
or Std diets.

Figure 2. Expression levels of Actin, alpha, cardiac
(Actc1, NM_009608) using RNA isolated from gastrocnemius
muscle of individual diabetic HF mice and corresponding Std
mice at different time points.

Figure 3. Data shown are expression levels for additional
actin-related and actin-binding genes exhibiting a
consistent decrease in expression in the HF mice in
comparison to Std mice at all four time points (Fig. 3(a))
or at three of the four time points (Fig. 3(b)).

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS OF THE
INVENTION 



Full-Length vs. Partial Length Genes/Proteins

A "full-length" gene is here defined as (1) a
naturally occurring DNA sequence which begins with an
initiation codon (almost always the Met codon, ATG), and
ends with a stop codon in phase with said initiation codon
(when introns, if any, are ignored), and thereby encodes a
naturally occurring polypeptide with biological activity, or
a naturally occurring precursor thereof, or (2) a synthetic
DNA sequence which encodes the same polypeptide as that
which is encoded by (1). The gene may, but need not,
include introns.
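
The structural part of this definition can be illustrated with a short sketch. It assumes an intron-free (cDNA-like) coding sequence, an ATG initiation codon, and the three standard stop codons; the function name is hypothetical, and the check says nothing about whether the encoded polypeptide is naturally occurring or biologically active.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumed)


def is_full_length_cds(seq):
    """Return True if seq begins with ATG and ends with the first in-phase stop codon."""
    seq = seq.upper()
    # Need at least a start and a stop codon, and a whole number of codons.
    if len(seq) < 6 or len(seq) % 3 != 0 or not seq.startswith("ATG"):
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    # The last codon must be a stop; no earlier in-phase codon may be a stop.
    return codons[-1] in STOP_CODONS and not any(c in STOP_CODONS for c in codons[:-1])
```

A sequence lacking the initiation codon, or truncated before its stop codon, fails this check and would be treated as a partial sequence.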

A "full-length" protein is here defined as a 
naturally occurring protein encoded by a full-length gene, 
or a protein derived naturally by post-translational 
modification of such a protein. Thus, it includes mature
proteins, proproteins, preproteins and preproproteins. It
also includes substitution and extension mutants of such 
naturally occurring proteins. 

Subjects 

A mouse is considered to be a diabetic subject if,
regardless of its fasting plasma insulin level, it has a
fasting plasma glucose level of at least 190 mg/dL. A mouse
is considered to be a hyperinsulinemic subject if its
fasting plasma insulin level is at least 0.67 ng/mL and it
does not qualify as a diabetic subject. A mouse is
considered to be "normal" if it is neither diabetic nor
hyperinsulinemic. Thus, normality is defined in a very
limited manner.

A mouse is considered "obese" if its weight is at least
15% in excess of the mean weight for mice of its age and
sex. A mouse which does not satisfy this standard may be
characterized as "non-obese", the term "normal" being
reserved for use in reference to glucose and insulin levels
as previously described.

A human is considered a diabetic subject if, regardless
of his or her fasting plasma insulin level, the fasting
plasma glucose level is at least 126 mg/dL. A human is
considered a hyperinsulinemic subject if the fasting plasma
insulin level is more than 26 micro International Units/mL
(it is believed that this is equivalent to 1.08 ng/mL), and
he or she does not qualify as a diabetic subject. A human is
considered to be "normal" if he or she is neither diabetic nor
hyperinsulinemic. Thus, normality is defined in a very
limited manner.

A human is considered "obese" if the body mass index
(BMI) (weight divided by height squared) is at least 30
kg/m^2. A human who does not satisfy this standard may be
characterized as "non-obese", the term "normal" being
reserved for use in reference to glucose and insulin levels
as previously described.

A human is considered overweight if the BMI is at least
25 kg/m^2. Thus, we define overweight to include obese
individuals, consistent with the recommendations of the
National Institute of Diabetes and Digestive and Kidney
Diseases (NIDDK). A human who does not satisfy this standard
may be characterized as "non-overweight."
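
The human criteria above can be restated as a minimal sketch. The function and parameter names are hypothetical; the thresholds are those given in the text (glucose in mg/dL, insulin in micro International Units/mL, BMI in kg/m^2).

```python
def classify_human_glycemic_status(fasting_glucose_mg_dl, fasting_insulin_uU_ml):
    """Apply the diabetic / hyperinsulinemic / normal definitions given above."""
    # Diabetic status is decided on glucose alone, regardless of insulin.
    if fasting_glucose_mg_dl >= 126:
        return "diabetic"
    # Hyperinsulinemic only if not already classified as diabetic.
    if fasting_insulin_uU_ml > 26:
        return "hyperinsulinemic"
    return "normal"


def classify_human_weight(weight_kg, height_m):
    """Return BMI and the overweight/obese flags; overweight includes obese."""
    bmi = weight_kg / height_m ** 2
    return {"bmi": bmi, "overweight": bmi >= 25, "obese": bmi >= 30}
```

Note that the two classifications are independent: a subject may be "normal" with respect to glucose and insulin while being obese, which is why the text reserves "normal" for the glycemic definition.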

According to the Report of the Expert Committee on the
Diagnosis and Classification of Diabetes Mellitus, Diabetes
Care 20: 1183-97 (1997), the following are risk factors for
diabetes type II:

older (e.g., at least 45; see below)

excessive weight (see below)

first-degree relative with diabetes mellitus

member of high risk ethnic group (black, Hispanic,
Native American, Asian)

history of gestational diabetes mellitus or delivering
a baby weighing more than 9 pounds (4.082 kg)

hypertensive (>140/90 mm Hg)

HDL cholesterol level <=35 mg/dL (0.90 mmol/L)

triglyceride level >=250 mg/dL (2.83 mmol/L)

Hence, in a preferred embodiment, the diagnostic and 
protective methods of the present invention are applied to 
human subjects exhibiting one or more of the aforementioned 
risk factors. Likewise, in a preferred embodiment, they are
applied to human subjects who, while not diabetic, exhibit
impaired glucose homeostasis (110 to <126 mg/dL).

The risk of diabetes increases with age. Hence, in
successive preferred embodiments, the age of the subjects is
at least 45, at least 50, at least 55, at least 60, at least 
65, at least 70, and at least 75. 

With regard to excessive weight, NIDDK says that "The
relative risk of diabetes increases by approximately 25
percent for each additional unit of BMI over 22." Hence, in
successive preferred embodiments, the BMI of the human
subjects is at least 23, at least 24, at least 25 (i.e.,
overweight by our criterion), at least 26, at least 27, at
least 28, at least 29, at least 30 (i.e., obese), at least
31, at least 32, at least 33, at least 34, at least 35, at
least 36, at least 37, at least 38, at least 39, at least
40, or over 40.

Age-Related Diseases 

Age-related (senescent) diseases include certain
cancers, atherosclerosis, diabetes (type 2), osteoporosis,
hypertension, depression, Alzheimer's, Parkinson's, glaucoma,
certain immune system defects, kidney failure, and liver
steatosis. In general, they are diseases for which the
relative risk (comparing a subpopulation over age 55 to a
suitably matched population under age 55) is at least 1.1.

Preferably, the agents of the present invention protect
against one or more age-related diseases for at least a
subpopulation of mature (post-puberty) adult subjects.



Direct and Indirect Utility of Identified Nucleic Acid 
Sequences and Related Molecules 

The mouse or human genes (or fragments thereof) may be
used directly. For diagnostic or screening purposes, they
(or specific binding fragments thereof) may be labeled and
used as hybridization probes. For therapeutic purposes,
they (or specific binding fragments thereof) may be used as
antisense reagents to inhibit the expression of the
corresponding gene, or of a sufficiently homologous gene of
another species.

If the database DNA appears to be a full-length cDNA
or gDNA, that is, it encodes an entire, functional,
naturally occurring protein, then it may be used in the
expression of that protein. Likewise, if the corresponding
human gene is known in full-length, it may be used to
express the human protein. Such expression may be in cell
culture, with the protein subsequently isolated and
administered exogenously to subjects who would benefit
therefrom, or in vivo, i.e., administration by gene therapy.
Naturally, any DNA encoding the same protein may be used for
the same purpose, and a DNA encoding a protein which is a
fragment or a mutant of that naturally occurring protein,
and which retains the desired activity, may be used for the
purpose of producing the active fragment or mutant. The
encoded protein of course has utility therapeutically and,
in labeled or immobilized form, diagnostically.

The genes may also be used indirectly, that is, to 
identify other useful DNAs, proteins, or other molecules. 

We have attempted to determine whether the mouse genes
disclosed herein have significant similarity to any known
human DNA, and whether, in any of the six possible
combinations of reading frame and strand, they encode a
protein similar to a known human protein. If so, then it
follows that the known human protein, and DNAs encoding that
protein, may be used in a similar manner. In addition, if
the known human protein is known to have additional
homologues, then those homologous proteins, and DNAs
encoding them, may be used in a similar manner.
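
The "six possible combinations" of frame and strand can be made concrete with a short sketch. It assumes the standard genetic code and an unambiguous A/C/G/T sequence; all names are hypothetical, and this is only an illustration of the concept, not the procedure used by any particular search program.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (assumed table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq):
    """Translate complete codons of seq; unknown codons become 'X', stops '*'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frame_translations(seq):
    """Return translations keyed by (strand, frame offset) for both strands."""
    forward = seq.upper()
    reverse = forward.translate(COMPLEMENT)[::-1]  # reverse complement
    return {
        (strand, frame): translate(s[frame:])
        for strand, s in (("+", forward), ("-", reverse))
        for frame in range(3)
    }
```

Each of the six translated sequences is a candidate protein query; only one of them, at most, normally corresponds to the protein actually encoded.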

There thus are several ways that a human protein 
homologue of interest can be identified by database 
searching, including but not limited to: 

1) a DNA->DNA (BlastN) search for human database DNAs
closely related to the mouse gene identifies a known human
gene, and the sequence of the human protein is deduced by
the Genetic Code;

2) a DNA->Protein (BlastX) search for human database proteins
closely related to the translated DNA of the mouse gene
identifies a known human protein; and

3) the sequence of the mouse protein is known or is deduced
by the Genetic Code, and a Protein->Protein (BlastP) search
for closely related database proteins identifies a known
human protein.

Once a known human gene is identified, it may be used in 
further BlastN or BlastX searches to identify other human 
genes or proteins. Once a known human protein is 
identified, it may be used in further BlastP searches to 
identify other human proteins. 
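
The pairing of query type and database type with a search program, as used in the three routes above, can be summarized in a small lookup. This is a sketch only: the function name and the lowercase program labels are illustrative, and only the three combinations named in the text are included.

```python
# (query type, database type) -> search program named in the text
BLAST_PROGRAM = {
    ("dna", "dna"): "blastn",          # DNA->DNA
    ("dna", "protein"): "blastx",      # DNA->Protein (query translated)
    ("protein", "protein"): "blastp",  # Protein->Protein
}


def choose_blast_program(query_type, database_type):
    """Return the program for a query/database pairing, per the routes above."""
    key = (query_type.lower(), database_type.lower())
    if key not in BLAST_PROGRAM:
        raise ValueError(f"no program listed in the text for {key}")
    return BLAST_PROGRAM[key]
```

For example, a mouse cDNA queried against a human protein database corresponds to route 2 and selects the DNA->Protein program.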

Searches may also take cognizance, intermediately, of known 
genes and proteins other than mouse or human ones, e.g., use 
the mouse sequence to identify a known rat sequence and then 
the rat sequence to identify a human one. 

If we have identified a mouse gene, and it encodes a 
mouse protein which appears similar to a human protein, then 
that human protein may be used (especially in humans) for
purposes analogous to the proposed use of the mouse protein
in mice. Moreover, a specific binding fragment of an
appropriate strand of the corresponding human gene (gDNA or
cDNA) could be labeled and used as a hybridization probe
(especially against samples of human mRNA or cDNA).

In determining whether the disclosed genes (gDNA or
cDNA) have significant similarities to known DNAs (and their
translated AA sequences to known proteins), one would
generally use the disclosed gene as a query sequence in a
search of a sequence database. The results of several such
searches are set forth in the Examples. Such results are
dependent, to some degree, on the search parameters.
Preferred parameters are set forth in Example 1. The
results are also dependent on the content of the database.
While the raw similarity score of a particular target
(database) sequence will not vary with content (as long as
it remains in the database), its informational value (in
bits), expected value, and relative ranking can change.
Generally speaking, the changes are small.

It will be appreciated that the nucleic acid and
protein databases keep growing. Hence a later search may
identify high scoring target sequences which were not
uncovered by an earlier search because the target sequences
were not previously part of a database.

Hence, in a preferred embodiment, the cognate DNAs and
proteins include not only those set forth in the examples,
but those which would have been highly ranked (top ten, more
preferably top three, even more preferably top two, most
preferably the top one) in a search run with the same
parameters on the date of filing of this application.

If the known mouse or human database DNA appears to be
a partial sequence (that is, partial relative to a cDNA or
gDNA encoding the whole naturally occurring protein), it may
be used as a hybridization probe to isolate the full-length
DNA. If the partial DNA encodes a biologically functional
fragment of the cognate protein, it may be used in a manner
similar to the full length DNA, i.e., to produce the
functional fragment.

If we have indicated that an antagonist of a protein or
other molecule is useful, then such an antagonist may be
obtained by preparing a combinatorial library, as described
below, of potential antagonists, and screening the library
members for binding to the protein or other molecule in
question. The binding members may then be further screened
for the ability to antagonize the biological activity of the
target. The antagonists may be used therapeutically, or, in
suitably labeled or immobilized form, diagnostically.

If the identified mouse or human database DNA is
related to a known protein, then substances known to
interact with that protein (e.g., agonists, antagonists,
substrates, receptors, second messengers, regulators, and so
forth), and binding molecules which bind them, are also of
utility. Such binding molecules can likewise be identified
by screening a combinatorial library.

Isolation of Full Length DNAs Using Partial DNAs as Probes

If it is determined that a DNA of the present invention
is a partial DNA, and the cognate full length DNA is not
listed in a sequence database, the available DNA may be used
as a hybridization probe to isolate the full-length DNA from
a suitable DNA library.

Stringent hybridization conditions are appropriate,
that is, conditions in which the hybridization temperature
is 5-10 deg. C. below the Tm of the DNA as a perfect duplex.

Identification and Isolation of Homologous Genes Using a DNA
Probe

It may be that the sequence databases available do not
include the sequence of any homologous gene (cDNA or gDNA),
or at least of the homologous gene for a species of
interest. However, given the cDNAs set forth above, one may
readily obtain the homologous gene.

The possession of one DNA (the "starting DNA") greatly
facilitates the isolation of homologous DNAs. If only a
partial DNA is known, this partial DNA may first be used as
a probe to isolate the corresponding full length DNA for the
same species, and the latter may then be used as the
starting DNA in the search for homologous genes.

The starting DNA, or a fragment thereof, is used as a
hybridization probe to screen a cDNA or genomic DNA library
for clones containing inserts which encode either the entire
homologous protein, or a recognizable fragment thereof. The
minimum length of the hybridization probe is dictated by the
need for specificity. If the size of the library in bases
is L, and the GC content is 50%, then the probe should have
a length of at least l, where L = 4^l. This will yield, on
average, a single perfect match in random DNA of L bases.
The human cDNA library is about 10^8 bases and the human
genomic DNA library is about 10^10 bases.
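
As a worked illustration of the relation L = 4^l, the following sketch finds the smallest whole-number probe length l for which 4^l is at least L. The function name is hypothetical, and the calculation assumes random sequence with equal base frequencies, as the text does.

```python
def min_probe_length(library_size_bases):
    """Smallest integer l such that 4**l >= library_size_bases."""
    length = 0
    # Integer arithmetic avoids floating-point rounding in log base 4.
    while 4 ** length < library_size_bases:
        length += 1
    return length


if __name__ == "__main__":
    print(min_probe_length(10 ** 8))   # cDNA library size given in the text
    print(min_probe_length(10 ** 10))  # genomic library size given in the text
```

For the library sizes quoted above this gives 14 bases for a library of 10^8 bases and 17 bases for a library of 10^10 bases; practical probes are usually longer to allow for non-random sequence composition.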

The library is preferably derived from an organism
which is known, on biochemical evidence, to produce a
homologous protein, and more preferably from the genomic DNA
or mRNA of cells of that organism which are likely to be
relatively high producers of that protein. A cDNA library
(which is derived from an mRNA library) is especially
preferred.

If the organism in question is known to have
substantially different codon preferences from that of the
organism whose relevant cDNA or genomic DNA is known, a
synthetic hybridization probe may be used which encodes the
same amino acid sequence but whose codon utilization is more
similar to that of the DNA of the target organism.
Alternatively, the synthetic probe may employ inosine as a
substitute for those bases which are most likely to be
divergent, or the probe may be a mixed probe which mixes the
codons for the source DNA with the preferred codons
(encoding the same amino acid) for the target organism.
By routine methods, the Tm of a perfect duplex of
starting DNA is determined. One may then select a
hybridization temperature which is sufficiently lower than
the perfect duplex Tm to allow hybridization of the starting
DNA (or other probe) to a target DNA which is divergent from
the starting DNA. A 1% sequence divergence typically lowers
the Tm of a duplex by 1-2°C, and the DNAs encoding
homologous proteins of different species typically have
sequence identities of around 50-80%. Preferably, the
library is screened under conditions where the temperature
is at least 20°C, more preferably at least 50°C, below the
perfect duplex Tm. Since salt raises the Tm, one
ordinarily would carry out the search for DNAs encoding
highly homologous proteins under relatively low salt
hybridization conditions, e.g., <1M NaCl. The higher the
salt concentration, and/or the lower the temperature, the
greater the sequence divergence which is tolerated.
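
The rule of thumb above (a 1% sequence divergence lowers the Tm by 1-2°C) can be turned into a rough estimate of the Tm of a mismatched duplex. This is a sketch under that rule only; the function names are hypothetical and the perfect-duplex Tm must be supplied from a separate measurement or calculation.

```python
def tm_drop_range(percent_identity):
    """Return (min, max) expected Tm reduction in deg C for a given % identity."""
    divergence = 100.0 - percent_identity
    # 1 deg C per 1% divergence at the low end, 2 deg C at the high end.
    return (1.0 * divergence, 2.0 * divergence)


def mismatched_tm_range(perfect_duplex_tm_c, percent_identity):
    """Return (lowest, highest) estimated Tm of the divergent duplex in deg C."""
    low_drop, high_drop = tm_drop_range(percent_identity)
    return (perfect_duplex_tm_c - high_drop, perfect_duplex_tm_c - low_drop)
```

For a target with 80% identity, the estimated reduction is 20-40°C, which is consistent with screening at least 20°C below the perfect duplex Tm.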

For the use of probes to identify homologous genes in
other species, see, e.g., Schwinn et al., J. Biol. Chem.,
265:8183-89 (1990) (hamster 67-bp cDNA probe vs. human
leukocyte genomic library; human 0.32-kb DNA probe vs. bovine
brain cDNA library, both with hybridization at 42°C in
6xSSC); Jenkins et al., J. Biol. Chem., 265:19624-31 (1990)
(chicken 770-bp cDNA probe vs. human genomic libraries;
hybridization at 40°C in 50% formamide and 5xSSC); Murata et
al., J. Exp. Med., 175:341-51 (1992) (1.2-kb mouse cDNA
probe vs. human eosinophil cDNA library; hybridization at
65°C in 6xSSC); Guyer et al., J. Biol. Chem., 265:17307-17
(1990) (2.95-kb human genomic DNA probe vs. porcine genomic
DNA library; hybridization at 42°C in 5xSSC). The
conditions set forth in these articles may each be
considered suitable for the purpose of isolating homologous
genes.

Corresponding (Homologous) Proteins and DNAs 

In the case of a gene chip, the manufacturer of the gene
chip determines which DNA to place at each position on the
chip. This DNA may correspond in sequence to a genomic DNA,
a cDNA, or a fragment of genomic or cDNA, and may be
natural, synthetic or partially natural and partially
synthetic in origin. The manufacturer of the gene chip will
normally identify the DNA for a mouse gene chip as
corresponding to a particular mouse gene, in which case it
will be assumed that the alignment of chip DNA to mouse
gene satisfies the homology criteria of the invention.
Usually, the gene chip manufacturer will provide a sequence
database accession number for the mouse DNA. If so, to
identify the corresponding mouse protein, we will first
inspect the database record for that mouse DNA. Often, the
mouse protein accession number will appear in that record or
in a linked record. If it doesn't, the corresponding mouse
protein can be identified by performing a BlastX search on a
mouse protein database with the mouse database DNA sequence
as the query sequence. Even if the protein sequence is not
in the database, if the DNA sequence comprises a full-length
coding sequence, the corresponding protein can be identified
by translating the coding sequence in accordance with the
Genetic Code.

A human protein can be said to be identifiable as
corresponding (homologous) to a gene chip DNA if it is
identified as corresponding (homologous) to the mouse gene
(gDNA or cDNA, whole or partial) identified by the gene chip
manufacturer as corresponding to that gene chip DNA.

In turn, it is identifiable as corresponding
(homologous) to said identified mouse gene, if

(1) it can be aligned by BlastX directly to that mouse gene,
and/or

(2) it is encoded by a human gene, or can be aligned to a
human gene by BlastX, which in turn can be aligned by BlastN
to said mouse gene, and/or

(3) it can be aligned by BlastP to a mouse protein, the
latter being encoded by said mouse gene, or aligned to said
mouse gene by BlastX,

where any alignment by BlastN, BlastP or BlastX is in
accordance with the default parameters set forth below, and
the expected value (E) of each alignment (the probability
that such an alignment would have occurred by chance alone)
is less than e-10. (Note that because this is a negative
exponent, a value such as e-50 is less than e-10.)



Desirably, two or all three of these conditions (1)-(3) are
satisfied for the corresponding (homologous) human genes and
proteins.

A human gene is corresponding (homologous) to a mouse
gene chip DNA, and hence to said identified mouse gene (or
cDNA) and protein, if it encodes a corresponding
(homologous) human protein as defined above, or it can be
aligned by BlastN to said mouse gene.

Preferably, for at least one of conditions (1)-(3), the
E value is less than e-50, more preferably less than e-60,
still more preferably less than e-70, even more preferably
less than e-80, considerably more preferably less than e-90,
and most preferably less than e-100. Desirably, it is true
for two or even all three of these conditions.
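
The way conditions (1)-(3) combine with the E value cutoff can be sketched as follows. The sketch assumes that "e-10" denotes 1e-10, and that each condition is represented by the list of E values of the alignments in its chain (one value for a direct alignment, two for an indirect one); the function name and data layout are hypothetical.

```python
E_CUTOFF = 1e-10  # "e-10" in the text, read here as 1e-10 (assumption)


def satisfied_conditions(routes, cutoff=E_CUTOFF):
    """Return labels of conditions whose every alignment has E below cutoff.

    routes maps a condition label to the E values of the alignments in its
    chain; an empty list means no alignment is available for that route.
    """
    return [
        label
        for label, chain in routes.items()
        if chain and all(e < cutoff for e in chain)
    ]


# Illustrative values only (not taken from the Master Table).
example = {
    "(1)": [1e-60],        # direct protein-to-gene alignment
    "(2)": [1e-80, 1e-5],  # second link in the chain fails the cutoff
    "(3)": [1e-120, 0.0],  # a reported "0.0" passes any positive cutoff
}
```

Calling `satisfied_conditions(example)` returns the labels "(1)" and "(3)"; the stricter preferred cutoffs (e-50 through e-100) can be tested by passing a different `cutoff`.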

In constructing Master Table 1, we generally used a
BlastX (mouse gene vs. human protein) alignment E value
cutoff of e-50. However, if there were no human proteins
with that good an alignment to the mouse DNA in question, or
if there were other reasons for including a particular human
protein (e.g., a known functionality supportive of the
observed differential cognate mouse protein expression),
then a human protein with a score worse (i.e., higher) than
e-50 may appear in Master Table 1.

If the manufacturer of the gene chip identifies the
gene chip DNA as corresponding to an EST, or other DNA which
is not a full-length mouse gene or cDNA, a longer (possibly
full length) mouse gene or cDNA may be identified by a
BlastN search of the mouse DNA database. Alternatively, the
identified DNA may be used to conduct a BlastN search of a
human DNA database, or a BlastX search of a mouse or human
protein database.

Thus, more generally, a human protein can be said to be
identifiable as corresponding (homologous) to a gene chip
DNA, or to a DNA identified by the manufacturer as
corresponding to that gene chip DNA, if

(1') it can be aligned directly to the gene chip or
corresponding manufacturer identified DNA by BlastX, and/or

(2') it can be aligned to a human gene/cDNA by BlastX, whose
genomic DNA (gDNA) or cDNA (DNA complementary to messenger
RNA) in turn can be aligned to the gene chip or
corresponding manufacturer identified DNA by BlastN, and/or

(3') it can be aligned to a mouse gene/cDNA by BlastX, whose
gDNA or cDNA in turn can be aligned to the gene chip or
corresponding manufacturer identified DNA by BlastN, and/or

(4') it can be aligned to a mouse protein by BlastP, which
in turn can be aligned to the gene chip or corresponding
manufacturer identified DNA by BlastX, and/or

(5') it can be aligned to a mouse protein by BlastP, which
in turn can be aligned to a mouse gene/cDNA by BlastX, whose
gDNA or cDNA can in turn be aligned to the gene chip or
corresponding manufacturer identified DNA by BlastN;

where any alignment by BlastN, BlastP, or BlastX is in
accordance with the default parameters set forth below, and
the expected value (E) of each alignment (the probability
that such an alignment would have occurred by chance alone)
is less than e-10. (Note that because this is a negative
exponent, a value such as e-50 is less than e-10.)

Preferably, two, three, four or all five of conditions
(1')-(5') are satisfied.

Preferably, for at least one of conditions (1')-(5'),
for at least the final alignment (i.e., vs. the human
protein), the E value is less than e-50, more preferably
less than e-60, still more preferably less than e-70, even
more preferably less than e-80, considerably more preferably
less than e-90, and most preferably less than e-100.

Desirably, one or more of these standards of preference
are met for two, three, four or all five of conditions (1')-
(5'). In particular, for those conditions in which the gene
chip or corresponding manufacturer identified DNA is
indirectly connected to the human protein by virtue of two
or more successive alignments, the E value is preferably so
limited for all of said alignments in the connecting chain.

A human gene corresponds (is homologous) to a gene chip
DNA or manufacturer identified corresponding DNA if it
encodes a homologous human protein as defined above, or if
it can be aligned either directly to that DNA, or indirectly
through a mouse gene which can be aligned to said DNA,
according to the conditions set forth above.

Master Table 1 assembles a list of human proteins
corresponding to each of the mouse DNAs/proteins identified
as related to the chip DNA. These human proteins form a set
and can be given a percentile rank, with respect to E value,
within that set. The human proteins of the present
invention preferably are those scorers with a percentile
rank of at least 50%, more preferably at least 60%, still
more preferably at least 70%, even more preferably at least
80%, and most preferably at least 90%.

For each mouse gene (gDNA or cDNA) in Master Table 1,
there is a particular human protein which provides the best
alignment match as measured by BlastX, i.e., the human
protein with the best score (lowest E value). These human
proteins form a subset of the set above and can be given a
percentile rank within that subset, e.g., the human proteins
with scores in the top 10% of that subset have a percentile
rank of 90% or higher.
The human proteins of the present invention preferably
are those best scorer subset proteins with a percentile rank
within the subset of at least 50%, more preferably at least
60%, still more preferably at least 70%, even more
preferably at least 80%, and most preferably at least 90%.

BlastN and BlastX report very low expected values as
"0.0". This does not truly mean that the expected value is
exactly zero (since any alignment could occur by chance),
but merely that it is so infinitesimal that it is not
reported. The documentation does not state the cutoff
value, but alignments with explicit E values as low as e-178
(624 bits) have been reported as nonzero values, while a
score of 636 bits was reported as "0.0".

Functionally homologous human proteins are also of
interest. A human protein may be said to be functionally
homologous to the mouse gene if the human protein has at
least one biological activity in common with the mouse
protein encoded by said mouse gene.

The human proteins of interest also include those that
are substantially and/or conservatively identical (as
defined below) to the homologous and/or functionally
homologous human proteins defined above.

Degree of Differential Expression

The degree of differential expression may be expressed
as the ratio of the higher expression level to the lower
expression level. Preferably, this is at least 2-fold, and
more preferably, it is higher, such as at least 3-fold, at
least 4-fold, at least 5-fold, at least 6-fold, at least
7-fold, at least 8-fold, at least 9-fold, or at least 10-fold.

Most preferably, the human protein of interest corresponds
to a mouse gene for which the degree of differential
expression places it among the top 10% of the mouse genes in
the appropriate subtable.
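
The ratio described above, and the selection of the most strongly differentially expressed genes, can be sketched as follows. The function names are hypothetical, expression levels are assumed to be positive numbers on a linear scale, and the rounding used for the "top 10%" (truncation, with at least one gene kept) is an assumption not specified in the text.

```python
def fold_difference(level_a, level_b):
    """Ratio of the higher expression level to the lower one (always >= 1)."""
    high, low = max(level_a, level_b), min(level_a, level_b)
    if low <= 0:
        raise ValueError("expression levels must be positive")
    return high / low


def top_fraction(fold_by_gene, fraction=0.10):
    """Return gene names in the top fraction by fold difference, best first."""
    ranked = sorted(fold_by_gene, key=fold_by_gene.get, reverse=True)
    keep = max(1, int(len(ranked) * fraction))  # keep at least one gene
    return ranked[:keep]
```

Because the ratio is taken as higher over lower, a gene that is 4-fold up-regulated and one that is 4-fold down-regulated receive the same degree of differential expression.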



Relevance of Favorable and Unfavorable Genes

If a gene is down-regulated in more favored mammals, or
up-regulated in less favored mammals (i.e., an "unfavorable
gene"), then several utilities are apparent.

First, the complementary strand of the gene, or a
portion thereof, may be used in labeled form as a
hybridization probe to detect messenger RNA and thereby
monitor the level of expression of the gene in a subject.
Elevated levels are indicative of progression, or
propensity to progression, to a less favored state, and
clinicians may take appropriate preventative, curative or
ameliorative action.

Secondly, the messenger RNA product (or equivalent
cDNA), the protein product, or a binding molecule specific
for that product (e.g., an antibody which binds the
product), or a downstream product which mediates the
activity (e.g., a signaling intermediate) or a binding
molecule (e.g., an antibody) therefor, may be used,
preferably in labeled or immobilized form, as an assay
reagent in an assay for said nucleic acid product, protein
product, or downstream product (e.g., a signaling
intermediate). Again, elevated levels are indicative of a
present or future problem.

Thirdly, an agent which down-regulates expression of
the gene may be used to reduce levels of the corresponding
protein and thereby inhibit further damage. This agent
could inhibit transcription of the gene in the subject, or
translation of the corresponding messenger RNA. Possible
inhibitors of transcription and translation include
antisense molecules and repressor molecules. The agent
could also inhibit a post-translational modification (e.g.,
glycosylation, phosphorylation, cleavage, GPI attachment)
required for activity, or post-translationally modify the
protein so as to inactivate it. Or it could be an agent
which down- or up-regulates a positive or negative
regulatory gene, respectively.

Fourthly, an agent which is an antagonist of the
messenger RNA product or protein product of the gene, or of
a downstream product through which its activity is
manifested (e.g., a signaling intermediate), may be used to
inhibit its activity.

This antagonist could be an antibody, a peptide, a
peptoid, a nucleic acid, a peptide nucleic acid (PNA)
oligomer, a small organic molecule of a kind for which a
combinatorial library exists (e.g., a benzodiazepine), etc.
An antagonist is simply a binding molecule which, by
binding, reduces or abolishes the undesired activity of its
target. The antagonist, if not an oligomeric molecule, is
preferably less than 1000 daltons, more preferably less than
500 daltons.

Fifthly, an agent which degrades, or abets the
degradation of, that messenger RNA, its protein product or a
downstream product which mediates its activity (e.g., a
signaling intermediate), may be used to curb the effective
period of activity of the protein.

If a gene is up-regulated in more favored mammals, or
down-regulated in less favored mammals, then the utilities
are converse to those stated above.

2 0 First, the complementary strand of the gene, or a 

portion thereof, may be used in labeled form as a 
hybridization probe to detect messenger RNA and thereby 
monitor the level of expression of the gene in a subject. 
Depressed levels are indicative of damage, or possibly of a 

25 propensity to damage, and clinicians may take appropriate 
preventative, curative or ameliorative action. 

Secondly, the messenger RNA product, the equivalent
cDNA, protein product, or a binding molecule specific for
those products, or a downstream product, or a signaling
intermediate, or a binding molecule therefor, may be used,
preferably in labeled or immobilized form, as an assay
reagent in an assay for said protein product or downstream
product. Again, depressed levels are indicative of a
present or future problem.

Thirdly, an agent which up-regulates expression of the
gene may be used to increase levels of the corresponding
protein and thereby inhibit further progression to a less
favored state. By way of example, it could be a vector
which carries a copy of the gene, but which expresses the
gene at higher levels than does the endogenous expression
system. Or it could be an agent which up- or down-regulates
a positive or negative regulatory gene.

Fourthly, an agent which is an agonist of the protein
product of the gene, or of a downstream product through
which its activity (of inhibition of progression to a less
favored state) is manifested, or of a signaling intermediate,
may be used to foster its activity.

Fifthly, an agent which inhibits the degradation of
that protein product or of a downstream product or of a
signaling intermediate may be used to increase the effective
period of activity of the protein.

Mutant Proteins

The present invention also contemplates mutant proteins
(peptides) which are substantially identical (as defined
below) to the parental protein (peptide). In general, the
fewer the mutations, the more likely the mutant protein is
to retain the activity of the parental protein. The effect
of mutations is usually (but not always) additive. Certain
individual mutations are more likely to be tolerated than
others.

A protein is more likely to tolerate a mutation which

(a) is a substitution rather than an insertion or
deletion;

(b) is an insertion or deletion at the terminus,
rather than internally, or, if internal, is at a
domain boundary, or a loop or turn, rather than in
an alpha helix or beta strand;

(c) affects a surface residue rather than an
interior residue;

(d) affects a part of the molecule distal to the
binding site;

(e) is a substitution of one amino acid for
another of similar size, charge, and/or
hydrophobicity, and does not destroy a disulfide
bond or other crosslink; and

(f) is at a site which is subject to substantial
variation among a family of homologous proteins to
which the protein of interest belongs.

These considerations can be used to design functional
mutants.

Surface vs. Interior Residues 

Charged amino acid residues almost always lie on the
surface of the protein. For uncharged residues, there is
less certainty, but in general, hydrophilic residues are
partitioned to the surface and hydrophobic residues to the
interior. Of course, for a membrane protein, the
membrane-spanning segments are likely to be rich in
hydrophobic residues.

Surface residues may be identified experimentally by
various labeling techniques, or by 3-D structure mapping
techniques like X-ray diffraction and NMR. A 3-D model of a
homologous protein can be helpful.

Binding Site Residues

Residues forming the binding site may be identified by
(1) comparing the effects of labeling the surface residues
before and after complexing the protein to its target, (2)
labeling the binding site directly with affinity ligands,
(3) fragmenting the protein and testing the fragments for
binding activity, and (4) systematic mutagenesis (e.g.,
alanine-scanning mutagenesis) to determine which mutations
destroy binding. If the binding site of a homologous
protein is known, the binding site may be postulated by
analogy.

Protein libraries may be constructed and screened so that
a large family (e.g., 10^8) of related mutants may be
evaluated simultaneously.

Hence, the mutations are preferably conservative
modifications as defined below.



"Substantially Identical" 

A mutant protein (peptide) is substantially identical
to a reference protein (peptide) if (a) it has at least 10%
of a specific binding activity or a non-nutritional
biological activity of the reference protein, and (b) is at
least 50% identical in amino acid sequence to the reference
protein (peptide). It is "substantially structurally
identical" if condition (b) applies, regardless of (a).

Percentage amino acid identity is determined by
aligning the mutant and reference sequences according to a
rigorous dynamic programming algorithm which globally aligns
their sequences to maximize their similarity, the similarity
being scored as the sum of scores for each aligned pair
according to an unbiased PAM250 matrix, and a penalty for
each internal gap of -12 for the first null of the gap and
-4 for each additional null of the same gap. The percentage
identity is the number of matches expressed as a percentage
of the adjusted (i.e., counting inserted nulls) length of
the reference sequence.

A mutant DNA sequence is substantially identical to a
reference DNA sequence if they are structural sequences
encoding mutant and reference proteins which are
substantially identical as described above.

If instead they are regulatory sequences, they are
substantially identical if the mutant sequence has at least
10% of the regulatory activity of the reference sequence,
and is at least 50% identical in nucleotide sequence to the
reference sequence. Percentage identity is determined as
for proteins except that matches are scored +5, mismatches
-4, the gap open penalty is -12, and the gap extension
penalty (per additional null) is -4.
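
By way of illustration only, the nucleotide scoring scheme
just described (match +5, mismatch -4, -12 for the first
null of a gap, -4 for each additional null) can be sketched
as a global alignment with affine gap penalties. The
function names are hypothetical and are not taken from any
particular software package. As a simplifying assumption,
this sketch penalizes terminal gaps in the same way as
internal gaps. The percentage identity is computed, as
stated above, as the number of matches divided by the
adjusted length of the reference sequence (its length
counting inserted nulls).

```python
NEG_INF = float("-inf")


def global_align(ref, mut, match=5, mismatch=-4, gap_first=-12, gap_next=-4):
    """Globally align ref and mut with affine gap penalties.

    Returns (aligned_ref, aligned_mut, score); nulls are shown as '-'.
    Assumption: terminal gaps are penalized like internal gaps.
    """
    n, m = len(ref), len(mut)
    # One score table per alignment state at cell (i, j):
    #   M: ref[i-1] is paired with mut[j-1]
    #   X: ref[i-1] is paired with a null (gap in the mutant)
    #   Y: mut[j-1] is paired with a null (gap in the reference)
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = gap_first + (i - 1) * gap_next
    for j in range(1, m + 1):
        Y[0][j] = gap_first + (j - 1) * gap_next
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if ref[i - 1] == mut[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + pair
            X[i][j] = max(M[i - 1][j] + gap_first, X[i - 1][j] + gap_next,
                          Y[i - 1][j] + gap_first)
            Y[i][j] = max(M[i][j - 1] + gap_first, Y[i][j - 1] + gap_next,
                          X[i][j - 1] + gap_first)

    tables = {"M": M, "X": X, "Y": Y}

    def predecessor(i, j, into):
        # Pick the best state at (i, j) given the state we step into next.
        options = []
        for name, table in tables.items():
            if into == "M":
                cost = 0  # the pair score is the same for every predecessor
            else:
                cost = gap_next if name == into else gap_first
            options.append((table[i][j] + cost, name))
        return max(options)[1]

    state = max((M[n][m], "M"), (X[n][m], "X"), (Y[n][m], "Y"))[1]
    score = tables[state][n][m]
    i, j = n, m
    out_ref, out_mut = [], []
    while i > 0 or j > 0:
        if state == "M":
            out_ref.append(ref[i - 1]); out_mut.append(mut[j - 1])
            i, j = i - 1, j - 1
        elif state == "X":
            out_ref.append(ref[i - 1]); out_mut.append("-")
            i -= 1
        else:
            out_ref.append("-"); out_mut.append(mut[j - 1])
            j -= 1
        state = predecessor(i, j, state)
    return "".join(reversed(out_ref)), "".join(reversed(out_mut)), score


def percent_identity(ref, mut):
    """Matches as a percentage of the adjusted reference length."""
    aligned_ref, aligned_mut, _ = global_align(ref, mut)
    if not aligned_ref:
        return 0.0
    matches = sum(1 for r, q in zip(aligned_ref, aligned_mut) if r == q)
    return 100.0 * matches / len(aligned_ref)
```

For protein sequences the same recursion would apply, with
the match and mismatch scores replaced by a lookup in the
PAM250 matrix, which is not reproduced in this sketch.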

More preferably, the sequence is not merely
substantially identical but rather is at least 51%, at least
66%, at least 75%, at least 80%, at least 85%, at least
90%, at least 95%, at least 96%, at least 97%, at least
98% or at least 99% identical in sequence to the reference
sequence.

DNA sequences may also be considered "substantially
identical" if they hybridize to each other under stringent
conditions, i.e., conditions at which the Tm of the
heteroduplex of the one strand of the mutant DNA and the
more complementary strand of the reference DNA is not in
excess of 10 °C less than the Tm of the reference DNA
homoduplex. Typically this will correspond to a percentage
identity of 85-90%.

"Conservative Modifications"

"Conservative modifications" are defined as

(a) conservative substitutions of amino acids as
hereafter defined; or

(b) single or multiple insertions (extension) or
deletions (truncation) of amino acids at the
termini.

Conservative modifications are preferred to other 
modifications. Conservative substitutions are preferred to 
other conservative modifications. 

"Semi-Conservative Modifications" are modifications
which are not conservative, but which are (a)
semi-conservative substitutions as hereafter defined; or (b)
single or multiple insertions or deletions internally, but
at interdomain boundaries, in loops or in other segments of
relatively high mobility. Semi-conservative modifications
are preferred to nonconservative modifications.
Semi-conservative substitutions are preferred to other
semi-conservative modifications.

Non-conservative substitutions are preferred to other
non-conservative modifications.

The term "conservative" is used here in an a priori
sense, i.e., modifications which would be expected to
preserve 3D structure and activity, based on analysis of the
naturally occurring families of homologous proteins and of
past experience with the effects of deliberate mutagenesis,
rather than post facto, a modification already known to
conserve activity. Of course, a modification which is
conservative a priori may be, and usually is, also
conservative post facto.

Preferably, except at the termini, no more than about
five amino acids are inserted or deleted at a particular
locus, and the modifications are outside regions known to
contain binding sites important to activity.
Preferably, insertions or deletions are limited to the
termini.

A conservative substitution is a substitution of one
amino acid for another of the same exchange group, the
exchange groups being defined as follows:

I Gly, Pro, Ser, Ala (Cys) (and any nonbiogenic,
neutral amino acid with a hydrophobicity not
exceeding that of the aforementioned a.a.'s)

II Arg, Lys, His (and any nonbiogenic, positively-
charged amino acids)

III Asp, Glu, Asn, Gln (and any nonbiogenic
negatively-charged amino acids)

IV Leu, Ile, Met, Val (Cys) (and any nonbiogenic,
aliphatic, neutral amino acid with a
hydrophobicity too high for I above)

V Phe, Trp, Tyr (and any nonbiogenic, aromatic
neutral amino acid with a hydrophobicity too high
for I above).

Note that Cys belongs to both I and IV.

Residues Pro, Gly and Cys have special conformational
roles. Cys participates in formation of disulfide bonds.
Gly imparts flexibility to the chain. Pro imparts rigidity
to the chain and disrupts alpha helices. These residues may
be essential in certain regions of the polypeptide, but
substitutable elsewhere.

One, two or three conservative substitutions are more
likely to be tolerated than a larger number.

"Semi-conservative substitutions" are defined herein as
being substitutions within supergroup I/II/III or within
supergroup IV/V, but not within a single one of groups I-V.
They also include replacement of any other amino acid with
alanine. If a substitution is not conservative, it
preferably is semi-conservative.

"Non-conservative substitutions" are substitutions
which are not "conservative" or "semi-conservative".

"Highly conservative substitutions" are a subset of
conservative substitutions, and are exchanges of amino acids
within the groups Phe/Tyr/Trp, Met/Leu/Ile/Val, His/Arg/Lys,
Asp/Glu and Ser/Thr/Ala. They are more likely to be
tolerated than other conservative substitutions. Again, the
smaller the number of substitutions, the more likely they
are to be tolerated.
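
By way of illustration only, the substitution classes
defined above can be expressed as a small lookup over the
twenty genetically encoded amino acids. The function and
variable names are hypothetical. Nonbiogenic amino acids are
not handled. Note that Thr appears in the highly
conservative set Ser/Thr/Ala but is not listed in exchange
groups I-V; the sketch therefore tests the highly
conservative sets first and otherwise treats Thr as
belonging to no exchange group, which is an assumption of
this sketch.

```python
# Exchange groups I-V, using one-letter amino acid codes.
# Cys (C) is listed in both group I and group IV.
EXCHANGE_GROUPS = {
    "I": set("GPSAC"),
    "II": set("RKH"),
    "III": set("DENQ"),
    "IV": set("LIMVC"),
    "V": set("FWY"),
}
SUPERGROUPS = [{"I", "II", "III"}, {"IV", "V"}]
HIGHLY_CONSERVATIVE_SETS = [
    set("FYW"), set("MLIV"), set("HRK"), set("DE"), set("STA"),
]


def groups_of(residue):
    """Return the names of every exchange group containing residue."""
    return {name for name, members in EXCHANGE_GROUPS.items()
            if residue in members}


def classify_substitution(old, new):
    """Classify the replacement of residue `old` by residue `new`."""
    if old == new:
        return "identical"
    if any(old in s and new in s for s in HIGHLY_CONSERVATIVE_SETS):
        return "highly conservative"
    old_groups, new_groups = groups_of(old), groups_of(new)
    if old_groups & new_groups:
        return "conservative"
    # Replacement of any other amino acid with alanine.
    if new == "A":
        return "semi-conservative"
    # Both residues fall within the same supergroup.
    for supergroup in SUPERGROUPS:
        if old_groups & supergroup and new_groups & supergroup:
            return "semi-conservative"
    return "non-conservative"
```

For example, under these definitions Leu to Ile is highly
conservative, Gly to Pro is conservative, Asp to Lys is
semi-conservative, and Lys to Phe is non-conservative.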

"Conservatively Identical"

A protein (peptide) is conservatively identical to a
reference protein (peptide) if it differs from the latter, if
at all, solely by conservative modifications, the protein
(peptide) remaining at least seven amino acids long if the
reference protein (peptide) was at least seven amino acids
long.

A protein is at least semi-conservatively identical to
a reference protein (peptide) if it differs from the latter,
if at all, solely by semi-conservative or conservative
modifications.

A protein (peptide) is nearly conservatively identical 
to a reference protein (peptide) if it differs from the 
latter, if at all, solely by one or more conservative 
modifications and/or a single nonconservative substitution. 

It is highly conservatively identical if it differs, if
at all, solely by highly conservative substitutions. Highly
conservatively identical proteins are preferred to those
merely conservatively identical. An absolutely identical
protein is even more preferred.

The core sequence of a reference protein (peptide) is
the largest single fragment which retains at least 10% of a
particular specific binding activity, if one is specified,
or otherwise of at least one specific binding activity of
the referent. If the referent has more than one specific
binding activity, it may have more than one core sequence,
and these may overlap or not.

If it is taught that a peptide of the present invention
may have a particular similarity relationship (e.g.,
markedly identical) to a reference protein (peptide),
preferred peptides are those which comprise a sequence
having that relationship to a core sequence of the reference
protein (peptide), but with internal insertions or deletions
in either sequence excluded. Even more preferred peptides
are those whose entire sequence has that relationship, with
the same exclusion, to a core sequence of that reference
protein (peptide).

Library 

The term "library" generally refers to a collection of
chemical or biological entities which are related in origin,
structure, and/or function, and which can be screened
simultaneously for a property of interest.

Libraries may be classified by how they are constructed
(natural vs. artificial diversity; combinatorial vs.
noncombinatorial), how they are screened (hybridization,
expression, display), or by the nature of the screened
library members (peptides, nucleic acids, etc.).

In a "natural diversity" library, essentially all of 
the diversity arose without human intervention. This would 
be true, for example, of messenger RNA extracted from a non- 
engineered cell. 

In a "synthetic diversity" library, essentially all of 
the diversity arose deliberately as a result of human 
intervention. This would be true for example of a 
combinatorial library; note that a small level of natural 
diversity could still arise as a result of spontaneous 
mutation. It would also be true of a noncombinatorial 
library of compounds collected from diverse sources, even if 
they were all natural products. 

In a "non-natural diversity" library, at least some of 
the diversity arose deliberately through human intervention. 

In a "controlled origin" library, the source of the 
diversity is limited in some way. A limitation might be to 
cells of a particular individual, to a particular species, 
or to a particular genus, or, more complexly, to individuals 
of a particular species who are of a particular age, sex, 
physical condition, geographical location, occupation and/or 
familial relationship. Alternatively or additionally, it 
might be to cells of a particular tissue or organ. Or it 
could be cells exposed to particular pharmacological,
environmental, or pathogenic conditions. Or the library
could be of chemicals, or a particular class of chemicals,
produced by such cells.

In a "controlled structure" library, the library 
members are deliberately limited by the production 
conditions to particular chemical structures. For example, 
if they are oligomers, they may be limited in length and 
monomer composition, e.g. hexapeptides composed of the 
twenty genetically encoded amino acids. 

Hybridization Library 

In a hybridization library, the library members are 
nucleic acids, and are screened using a nucleic acid 
hybridization probe. Bound nucleic acids may then be 
amplified, cloned, and/or sequenced. 

Expression Library 

In an expression library, the screened library members
are gene expression products, but one may also speak of an
underlying library of genes encoding those products. The
library is made by subcloning DNA encoding the library
members (or portions thereof) into expression vectors (or
into cloning vectors which subsequently are used to
construct expression vectors), each vector comprising an
expressible gene encoding a particular library member,
introducing the expression vectors into suitable cells, and
expressing the genes so the expression products are
produced.

In one embodiment, the expression products are 
secreted, so the library can be screened using an affinity 
reagent, such as an antibody or receptor. The bound 
expression products may be sequenced directly, or their 
sequences inferred by, e.g., sequencing at least the 
variable portion of the encoding DNA. 

In a second embodiment, the cells are lysed, thereby 
exposing the expression products, and the latter are 
screened with the affinity reagent. 

In a third embodiment, the cells express the library
members in such a manner that they are displayed on the
surface of the cells, or on the surface of viral particles
produced by the cells. (See display libraries, below.)

In a fourth embodiment, the screening is not for the
ability of the expression product to bind to an affinity
reagent, but rather for its ability to alter the phenotype
of the host cell in a particular detectable manner. Here,
the screened library members are transformed cells, but
there is a first underlying library of expression products
which mediate the behavior of the cells, and a second
underlying library of genes which encode those products.

Display Library 

In a display library, the library members are each
conjugated to, and displayed upon, a support of some kind.
The support may be living (a cell or virus), or nonliving
(e.g., a bead or plate).

If the support is a cell or virus, display will
normally be effectuated by expressing a fusion protein which
comprises the library member, a carrier moiety allowing
integration of the fusion protein into the surface of the
cell or virus, and optionally a linking moiety. In a
variation on this theme, the cell coexpresses a first fusion
comprising the library member and a linking moiety L1, and a
second fusion comprising a linking moiety L2 and the carrier
moiety. L1 and L2 interact to associate the first fusion
with the second fusion and hence, indirectly, the library
member with the surface of the cell or virus.

Soluble Library 

In a soluble library, the library members are free in
solution. A soluble library may be produced directly, or
one may first make a display library and then release the
library members from their supports.

Encapsulated Library

In an encapsulated library, the library members are
inside cells or liposomes. Generally speaking, encapsulated
libraries are used to store the library members for future
use; the members are extracted in some way for screening
purposes. However, if they differentially affect the
phenotype of the cells, they may be screened indirectly by
screening the cells.



cDNA Library

A cDNA library is usually prepared by extracting RNA
from cells of particular origin, fractionating the RNA to
isolate the messenger RNA (mRNA has a poly(A) tail, so this
is usually done by oligo-dT affinity chromatography),
synthesizing complementary DNA (cDNA) using reverse
transcriptase, DNA polymerase, and other enzymes, subcloning
the cDNA into vectors, and introducing the vectors into
cells. Often, only mRNAs or cDNAs of particular sizes will
be used, to make it more likely that the cDNA encodes a
functional polypeptide.

A cDNA library explores the natural diversity of the 
transcribed DNAs of cells from a particular source. It is 
not a combinatorial library. 

A cDNA library may be used to make a hybridization
library, or it may be used as (or to make) an expression
library.

Genomic DNA Library 

A genomic DNA library is made by extracting DNA from a
particular source, fragmenting the DNA, isolating fragments
of a particular size range, subcloning the DNA fragments
into vectors, and introducing the vectors into cells.

Like a cDNA library, a genomic DNA library is a natural
diversity library, and not a combinatorial library. A
genomic DNA library may be used in the same way as a cDNA
library.

Synthetic DNA Library

A synthetic DNA library may be screened directly (as a
hybridization library), or used in the creation of an
expression or display library of peptides/proteins.



Combinatorial Libraries

The term "combinatorial library" refers to a library in
which the individual members are either systematic or random
combinations of a limited set of basic elements, the
properties of each member being dependent on the choice and
location of the elements incorporated into it. Typically,
the members of the library are at least capable of being
screened simultaneously. Randomization may be complete or
partial; some positions may be randomized and others
predetermined, and at random positions, the choices may be
limited in a predetermined manner. The members of a
combinatorial library may be oligomers or polymers of some
kind, in which the variation occurs through the choice of
monomeric building block at one or more positions of the
oligomer or polymer, and possibly in terms of the connecting
linkage, or the length of the oligomer or polymer, too. Or
the members may be nonoligomeric molecules with a standard
core structure, like the 1,4-benzodiazepine structure, with
the variation being introduced by the choice of substituents
at particular variable sites on the core structure. Or the
members may be nonoligomeric molecules assembled like a
jigsaw puzzle, but wherein each piece has both one or more
variable moieties (contributing to library diversity) and
one or more constant moieties (providing the functionalities
for coupling the piece in question to other pieces).

Thus, in a typical combinatorial library, chemical
building blocks are at least partially randomly combined
into a large number (as high as 10^15) of different compounds,
which are then simultaneously screened for binding (or
other) activity against one or more targets.
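
By way of illustration only, a partially randomized
oligomer library of the kind described above (some
positions predetermined, others randomized over a limited
set of choices) can be enumerated as a Cartesian product of
the building blocks allowed at each position. The function
names and the example template are hypothetical.

```python
from itertools import product

# The twenty genetically encoded amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def enumerate_library(template):
    """List every member of a small oligomer library.

    template is a sequence of strings; each string holds the
    building blocks allowed at that position. A one-character
    string marks a predetermined (non-randomized) position.
    """
    choices = [sorted(set(position)) for position in template]
    return ["".join(member) for member in product(*choices)]


def count_structures(template):
    """Number of unique structures, computed without enumerating them."""
    total = 1
    for position in template:
        total *= len(set(position))
    return total


# Fixed Ala, then Asp or Glu, then Lys/Arg/His, then fixed Gly.
small_library = enumerate_library(["A", "DE", "KRH", "G"])

# A fully randomized hexapeptide library is too large to list
# conveniently, but its number of unique structures is easy to count.
hexapeptide_count = count_structures([AMINO_ACIDS] * 6)
```

Counting rather than enumerating is the only practical
option for the larger libraries contemplated here, since the
number of members grows as the product of the number of
choices at each position.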

In a "simple combinatorial library", all of the members
belong to the same class of compounds (e.g., peptides) and
can be synthesized simultaneously. A "composite
combinatorial library" is a mixture of two or more simple
libraries, e.g., DNAs and peptides, or peptides, peptoids,
and PNAs, or benzodiazepines and carbamates. The number of
component simple libraries in a composite library will, of
course, normally be smaller than the average number of
members in each simple library, as otherwise the advantage
of a library over individual synthesis is small.

Libraries of thousands, even millions, of random
oligopeptides have been prepared by chemical synthesis
(Houghten et al., Nature, 354:84-6(1991)), or gene
expression (Marks et al., J Mol Biol, 222:581-97(1991)),
displayed on chromatographic supports (Lam et al., Nature,
354:82-4(1991)), inside bacterial cells (Colas et al.,
Nature, 380:548-550(1996)), on bacterial pili (Lu,
Bio/Technology, 13:366-372(1990)), or phage (Smith, Science,
228:1315-7(1985)), and screened for binding to a variety of
targets including antibodies (Valadon et al., J Mol Biol,
261:11-22(1996)), cellular proteins (Schmitz et al., J Mol
Biol, 260:664-677(1996)), viral proteins (Hong and
Boulanger, Embo J, 14:4714-4727(1995)), bacterial proteins
(Jacobsson and Frykberg, Biotechniques, 18:878-885(1995)),
nucleic acids (Cheng et al., Gene, 171:1-8(1996)), and
plastic (Siani et al., J Chem Inf Comput Sci,
34:588-593(1994)).

Libraries of proteins (Ladner, USP 4,664,989), peptoids
(Simon et al., Proc Natl Acad Sci USA, 89:9367-71(1992)),
nucleic acids (Ellington and Szostak, Nature,
346:818(1990)), carbohydrates, and small organic molecules
(Eichler et al., Med Res Rev, 15:481-96(1995)) have also
been prepared or suggested for drug screening purposes.

The first combinatorial libraries were composed of
peptides or proteins, in which all or selected amino acid
positions were randomized. Peptides and proteins can exhibit
high and specific binding activity, and can act as
catalysts. In consequence, they are of great importance in
biological systems.

Nucleic acids have also been used in combinatorial
libraries. Their great advantage is the ease with which a
nucleic acid with appropriate binding activity can be
amplified. As a result, combinatorial libraries composed of
nucleic acids can be of low redundancy and hence, of high
diversity.

There has also been much interest in combinatorial
libraries based on small molecules, which are more suited to
pharmaceutical use, especially those which, like
benzodiazepines, belong to a chemical class which has
already yielded useful pharmacological agents. The
techniques of combinatorial chemistry have been recognized
as the most efficient means for finding small molecules that
act on these targets. At present, small molecule
combinatorial chemistry involves the synthesis of either
pooled or discrete molecules that present varying arrays of
functionality on a common scaffold. These compounds are
grouped in libraries that are then screened against the
target of interest either for binding or for inhibition of
biological activity.

The size of a library is the number of molecules in it.
The simple diversity of a library is the number of unique
structures in it. There is no formal minimum or maximum
diversity. If the library has a very low diversity, the
library has little advantage over just synthesizing and
screening the members individually. If the library is of
very high diversity, it may be inconvenient to handle, at
least without automating the process. The simple
diversity of a library is preferably at least 10, 10E2,
10E3, 10E4, 10E6, 10E7, 10E8 or 10E9, the higher the better
under most circumstances. The simple diversity is usually
not more than 10E15, and more usually not more than 10E10.

The average sampling level is the size divided by the
simple diversity. The expected average sampling level must
be high enough to provide reasonable assurance that, if a
given structure is expected, as a consequence of the
library design, to be present, the actual average
sampling level will be high enough that the structure, if
it satisfies the screening criteria, will yield a positive
result when the library is screened. Thus, the preferred
average sampling level is a function of the detection limit,
which in turn is a function of the strength of the signal to
be screened.
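
By way of illustration only, the three quantities just
defined (size, simple diversity, and average sampling level)
can be computed for a library represented as a list of
member structures. The function name and the example members
are hypothetical; a real library would of course be
characterized experimentally rather than by listing its
molecules.

```python
def library_statistics(members):
    """Return (size, simple_diversity, average_sampling_level).

    size is the number of molecules, simple diversity is the
    number of unique structures, and the average sampling
    level is the size divided by the simple diversity.
    """
    size = len(members)
    if size == 0:
        raise ValueError("an empty library has no sampling level")
    simple_diversity = len(set(members))
    return size, simple_diversity, size / simple_diversity


# Six molecules representing three unique structures.
example = ["AAG", "AAG", "AGG", "GGG", "GGG", "GGG"]
size, diversity, sampling = library_statistics(example)
```

In this toy example each unique structure is represented
twice on average, although individual structures are present
in one, two or three copies.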

There are more complex measures of diversity than
simple diversity. These attempt to take into account the
degree of structural difference between the various unique
sequences. These more complex measures are usually used in
the context of small organic compound libraries, see below.

The library members may be presented as solutes in
solution, or immobilized on some form of support. In the
latter case, the support may be living (cell, virus) or
nonliving (bead, plate, etc.). The supports may be separable
(cells, virus particles, beads) so that binding and
nonbinding members can be separated, or nonseparable
(plate). In the latter case, the members will normally be
placed on addressable positions on the support. The
advantage of a soluble library is that there is no carrier
moiety that could interfere with the binding of the members
to the target. The advantage of an immobilized library is
that it is easier to identify the structure of the members
which were positive.

When screening a soluble library, or one with a
separable support, the target is usually immobilized. When
screening a library on a nonseparable support, the target
will usually be labeled.

Oligonucleotide Libraries 

An oligonucleotide library is a combinatorial library,
at least some of whose members are single-stranded
oligonucleotides having three or more nucleotides connected
by phosphodiester or analogous bonds. The oligonucleotides
may be linear, cyclic or branched, and may include
non-nucleic acid moieties. The nucleotides are not limited to
the nucleotides normally found in DNA or RNA. For examples
of nucleotides modified to increase nuclease resistance and
chemical stability of aptamers, see Chart 1 in Osborne and
Ellington, Chem. Rev., 97: 349-70 (1997). For screening of
RNA, see Ellington and Szostak, Nature, 346: 818-22 (1990).

There is no formal minimum or maximum size for these
oligonucleotides. However, the number of conformations which
an oligonucleotide can assume increases exponentially with
its length in bases. Hence, a longer oligonucleotide is
more likely to be able to fold to adapt itself to a protein
surface. On the other hand, while very long molecules can
be synthesized and screened, unless they provide a much
superior affinity to that of shorter molecules, they are not
likely to be found in the selected population, for the
reasons explained by Osborne and Ellington (1997). Hence,
the libraries of the present invention are preferably
composed of oligonucleotides having a length of 3 to 100
bases, more preferably 15 to 35 bases. The oligonucleotides
in a given library may be of the same or of different
lengths.

Oligonucleotide libraries have the advantage that
libraries of very high diversity (e.g., 10^15) are feasible,
and binding molecules are readily amplified in vitro by
polymerase chain reaction (PCR). Moreover, nucleic acid
molecules can have very high specificity and affinity to
targets.
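
By way of illustration only, the relationship between
oligonucleotide length and the number of possible sequences
can be worked out directly, assuming (as a simplification
made for this sketch) that every position is fully
randomized over the four standard bases. The function names
are hypothetical.

```python
def sequence_space(length, alphabet_size=4):
    """Number of distinct sequences of the given length."""
    return alphabet_size ** length


def shortest_length_for(target_diversity, alphabet_size=4):
    """Smallest length whose sequence space reaches the target."""
    length = 0
    while sequence_space(length, alphabet_size) < target_diversity:
        length += 1
    return length


# A fully randomized 25-mer already spans slightly more than
# 10^15 sequences, which falls inside the preferred range of
# 15 to 35 bases.
space_25 = sequence_space(25)
needed = shortest_length_for(10 ** 15)
```

For longer random regions the number of possible sequences
exceeds any library that can physically be prepared, so only
a fraction of the sequence space would be sampled.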

In a preferred embodiment, this invention prepares and
screens oligonucleotide libraries by the SELEX method, as
described in King and Famulok, Molec. Biol. Repts., 20:
97-107 (1994); L. Gold, C. Tuerk, Methods of producing
nucleic acid ligands, US#5595877; Oliphant et al., Gene
44:177 (1986).

The term "aptamer" is conferred on those
oligonucleotides which bind the target protein. Such
aptamers may be used to characterize the target protein,
both directly (through identification of the aptamer and the
points of contact between the aptamer and the protein) and
indirectly (by use of the aptamer as a ligand to modify the
chemical reactivity of the protein).

In a classic oligonucleotide, each nucleotide (monomeric
unit) is composed of a phosphate group, a sugar moiety, and
either a purine or a pyrimidine base. In DNA, the sugar is
deoxyribose and in RNA it is ribose. The nucleotides are
linked by 5'-3' phosphodiester bonds.

The deoxyribose phosphate backbone of DNA can be
modified to increase resistance to nuclease and to increase
penetration of cell membranes. Derivatives such as mono- or
dithiophosphates, methyl phosphonates, boranophosphates,
formacetals, carbamates, siloxanes, and dimethylenethio-,
-sulfoxido-, and -sulfono-linked species are known in the
art.

Peptide Library

A peptide is composed of a plurality of amino acid
residues joined together by peptidyl (-NHCO-) bonds. A
biogenic peptide is a peptide in which the residues are all
genetically encoded amino acid residues; it is not necessary
that the biogenic peptide actually be produced by gene
expression.

Amino acids are the basic building blocks with which
peptides and proteins are constructed. Amino acids possess
both an amino group (-NH2) and a carboxylic acid group
(-COOH). Many amino acids, but not all, have the alpha amino
acid structure NH2-CHR-COOH, where R is hydrogen, or any of a
variety of functional groups.

Twenty amino acids are genetically encoded: Alanine,
Arginine, Asparagine, Aspartic Acid, Cysteine, Glutamic
Acid, Glutamine, Glycine, Histidine, Isoleucine, Leucine,
Lysine, Methionine, Phenylalanine, Proline, Serine,
Threonine, Tryptophan, Tyrosine, and Valine. Of these, all
save Glycine are optically isomeric; however, only the
L-form is found in humans. Nevertheless, the D-forms of these
amino acids do have biological significance; D-Phe, for
example, is a known analgesic.

Many other amino acids are also known, including:
2-Aminoadipic acid; 3-Aminoadipic acid; beta-Aminopropionic
acid; 2-Aminobutyric acid; 4-Aminobutyric acid (Piperidinic
acid); 6-Aminocaproic acid; 2-Aminoheptanoic acid;
2-Aminoisobutyric acid; 3-Aminoisobutyric acid;
2-Aminopimelic acid; 2,4-Diaminobutyric acid; Desmosine;
2,2'-Diaminopimelic acid; 2,3-Diaminopropionic acid;
N-Ethylglycine; N-Ethylasparagine; Hydroxylysine;
allo-Hydroxylysine; 3-Hydroxyproline; 4-Hydroxyproline;
Isodesmosine; allo-Isoleucine; N-Methylglycine (Sarcosine);
N-Methylisoleucine; N-Methylvaline; Norvaline; Norleucine;
and Ornithine.

Peptides are constructed by condensation of amino acids
and/or smaller peptides. The amino group of one amino acid
(or peptide) reacts with the carboxylic acid group of a
second amino acid (or peptide) to form a peptide (-NHCO-)
bond, releasing one molecule of water. Therefore, when an
amino acid is incorporated into a peptide, it should,
technically speaking, be referred to as an amino acid
residue. The core of that residue is the moiety which
excludes the -NH and -CO linking functionalities which
connect it to other residues. This moiety consists of one
or more main chain atoms (see below) and the attached side
chains.

The main chain moiety of each amino acid consists of 
the -NH and -CO linking functionalities and a core main 
chain moiety. Usually the latter is a single carbon atom.
However, the core main chain moiety may include additional
carbon atoms, and may also include nitrogen, oxygen or 
sulfur atoms, which together form a single chain. In a 
preferred embodiment, the core main chain atoms consist 
solely of carbon atoms. 

The side chains are attached to the core main chain
atoms. For alpha amino acids, in which the side chain is
attached to the alpha carbon, the C-1, C-2 and N-2 of each
residue form the repeating unit of the main chain, and the
term "side chain" refers to the C-3 and higher numbered
carbon atoms and their substituents. It also includes H
atoms attached to the main chain atoms.

Amino acids may be classified according to the number 
of carbon atoms which appear in the main chain between the 
carbonyl carbon and amino nitrogen atoms which participate
in the peptide bonds. Among the 150 or so amino acids which
occur in nature, alpha, beta, gamma and delta amino acids
are known. These have 1-4 intermediary carbons. Only alpha
amino acids occur in proteins. Proline is a special case of
an alpha amino acid; its side chain also binds to the
peptide bond nitrogen.

For beta and higher order amino acids, there is a 
choice as to which main chain core carbon a side chain other 
than H is attached to. The preferred attachment site is the 
C-2 (alpha) carbon, i.e., the one adjacent to the carboxyl
carbon of the -CO linking functionality. It is also possible
for more than one main chain atom to carry a side chain
other than H. However, in a preferred embodiment, only one
main chain core atom carries a side chain other than H.

A main chain carbon atom may carry either one or two
side chains; one is more common. A side chain may be
attached to a main chain carbon atom by a single or a double
bond; the former is more common.

A simple combinatorial peptide library is one whose
members are peptides having three or more amino acids
connected via peptide bonds.

The peptides may be linear, branched, or cyclic, and 
may covalently or noncovalently include nonpeptidyl
moieties. The amino acids are not limited to the naturally
occurring or to the genetically encoded amino acids. 

A biased peptide library is one in which one or more 
(but not all) residues of the peptides are constant 
residues. 
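
The notion of a biased library can be illustrated with a minimal sketch that enumerates every member of a template in which some positions are held constant and the rest are varied. The template string, the `X` wildcard convention, and the function name are assumptions made for illustration only; they do not appear in the specification.

```python
from itertools import product

# The 20 genetically encoded amino acids, in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def biased_library(template, alphabet=AMINO_ACIDS, wildcard="X"):
    """Yield every peptide matching `template`.

    Positions marked with `wildcard` are variable; every other
    character is a constant residue shared by all members.
    """
    choices = [alphabet if ch == wildcard else ch for ch in template]
    for combo in product(*choices):
        yield "".join(combo)


# Hypothetical template: constant cysteines flank three variable positions.
members = list(biased_library("CXXXC"))
print(len(members))  # 20**3 = 8000
print(members[0])    # CAAAC
```

Every member of this toy library shares the two constant cysteines, while the three interior positions range over all twenty encoded residues.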

Cyclic Peptides 

Many naturally occurring peptides are cyclic.
Cyclization is a common mechanism for stabilizing
peptide conformation, thereby achieving improved association
of the peptide with its ligand and hence improved biological
activity. Cyclization is usually achieved by intra-chain
cystine formation, or by formation of a peptide bond between
side chains or between the N- and C-terminals. Cyclization has
usually been carried out with peptides in solution, but several
publications have appeared that describe cyclization of
peptides on beads.

A peptide library may be an oligopeptide library or a 
protein library. 

Oligopeptides

Preferably, the oligopeptides are at least five, six,
seven or eight amino acids in length. Preferably, they are
composed of fewer than 50, more preferably fewer than 20,
amino acids.

In the case of an oligopeptide library, all or just
some of the residues may be variable. The oligopeptide may
be unconstrained, or constrained to a particular
conformation by, e.g., the participation of constant
cysteine residues in the formation of a constraining
disulfide bond.

Proteins 

Proteins, like oligopeptides, are composed of a
plurality of amino acids, but the term protein is usually
reserved for longer peptides, which are able to fold into a
stable conformation. A protein may be composed of two or
more polypeptide chains, held together by covalent or
noncovalent crosslinks. These may occur in a homooligomeric
or a heterooligomeric state.

A peptide is considered a protein if it (1) is at least
50 amino acids long, or (2) has at least two stabilizing
covalent crosslinks (e.g., disulfide bonds). Thus,
conotoxins are considered proteins.
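
The two-part definition above can be written as a simple predicate. This is only a sketch of the stated rule; the function name and the example inputs are hypothetical.

```python
def is_protein(length, covalent_crosslinks=0):
    """Apply the definition given above: a peptide counts as a protein
    if it is at least 50 residues long OR has at least two stabilizing
    covalent crosslinks (e.g., disulfide bonds)."""
    return length >= 50 or covalent_crosslinks >= 2


# Hypothetical examples:
print(is_protein(120))                         # long chain, no crosslinks
print(is_protein(13, covalent_crosslinks=2))   # short, doubly crosslinked
print(is_protein(13, covalent_crosslinks=1))   # short, singly crosslinked
```

Under this rule a short peptide with two disulfide bonds qualifies, which is why the text treats conotoxins as proteins.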

Usually, the proteins of a protein library will be
characterizable as having both constant residues (the same
for all proteins in the library) and variable residues
(which vary from member to member). This is simply because,
for a given range of variation at each position, the
sequence space (simple diversity) grows exponentially with
the number of residue positions, so at some point it becomes
inconvenient for all residues of a peptide to be variable
positions. Since proteins are usually larger than
oligopeptides, it is more common for protein libraries than
oligopeptide libraries to feature constant positions.
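
The exponential growth of sequence space is easy to see numerically. The sketch below assumes, purely for illustration, that each variable position may take any of the 20 genetically encoded amino acids; the function name is hypothetical.

```python
def sequence_space(n_variable, choices_per_position=20):
    """Number of distinct sequences when `n_variable` positions are
    each allowed `choices_per_position` different residues."""
    return choices_per_position ** n_variable


for n in (5, 8, 20, 50):
    print(n, sequence_space(n))
```

Five variable positions already give 3,200,000 sequences and eight give 25,600,000,000, so fully randomizing a chain of 50 or more residues is impractical.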

In the case of a protein library, it is desirable to 
focus the mutations at those sites which are tolerant of 
mutation. These may be determined by alanine scanning
mutagenesis or by comparison of the protein sequence to that
of homologous proteins of similar activity. It is also more
likely that mutation of surface residues will directly
affect binding. Surface residues may be determined by
inspecting a 3D structure of the protein, or by labeling the
surface and then ascertaining which residues have received
labels. They may also be inferred by identifying regions of
high hydrophilicity within the protein.

Because proteins are often altered at some sites but
not others, protein libraries can be considered a special
case of the biased peptide library.

There are several reasons that one might screen a
protein library instead of an oligopeptide library,
including (1) a particular protein, mutated in the library,
has the desired activity to some degree already, and (2) the
oligopeptides are not expected to have a sufficiently high
affinity or specificity since they do not have a stable
conformation.

When the protein library is based on a parental protein 
which does not have the desired activity, the parental 
protein will usually be one which is of high stability 
(melting point >= 50 deg. C.) and/or possessed of
hypervariable regions.

The variable domains of an antibody possess 
hypervariable regions and hence, in some embodiments, the 
protein library comprises members which comprise a mutant of
a VH or VL chain, or a mutant of an antigen-specific binding
fragment of such a chain. VH and VL chains are usually each
about 110 amino acid residues, and are held in proximity by
a disulfide bond between the adjoining CL and CH1 regions to
form a variable domain. Together, the VH, VL, CL and CH1
form an Fab fragment.

In human heavy chains, the hypervariable regions are at
31-35, 49-65, 98-111 and 84-88, but only the first three are
involved in antigen binding. There is variation among VH 
and VL chains at residues outside the hypervariable regions, 
but to a much lesser degree. 

A sequence is considered a mutant of a VH or VL chain
if it is at least 80% identical to a naturally occurring VH
or VL chain at all residues outside the hypervariable
regions.
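
The 80% criterion can be sketched as a calculation over a pair of sequences. The code below assumes the candidate and reference have already been aligned to equal length (how to align them is not specified here), treats any differing character, including a gap, as a mismatch, and uses toy sequences and a toy hypervariable range; all names are hypothetical.

```python
def framework_identity(candidate, reference, hypervariable):
    """Fraction of identical residues outside the hypervariable ranges.

    `candidate` and `reference` are pre-aligned strings of equal length.
    `hypervariable` is an iterable of (start, end) residue ranges,
    1-based and inclusive, which are excluded from the comparison.
    """
    if len(candidate) != len(reference):
        raise ValueError("sequences must be aligned to equal length")
    masked = set()
    for start, end in hypervariable:
        masked.update(range(start, end + 1))
    positions = [i for i in range(1, len(reference) + 1) if i not in masked]
    if not positions:
        raise ValueError("no residues remain outside the masked ranges")
    matches = sum(candidate[i - 1] == reference[i - 1] for i in positions)
    return matches / len(positions)


def is_mutant_chain(candidate, reference, hypervariable, threshold=0.80):
    """True if the candidate meets the identity threshold stated above."""
    return framework_identity(candidate, reference, hypervariable) >= threshold


# Toy 20-residue example with a single masked range, 6-10.
reference = "ACDEFGHIKLMNPQRSTVWY"
toy_ranges = [(6, 10)]
close = "G" + reference[1:6] + "AA" + reference[8:]   # 1 change outside, 2 inside
distant = "GGGG" + reference[4:]                      # 4 changes outside

print(is_mutant_chain(close, reference, toy_ranges))
print(is_mutant_chain(distant, reference, toy_ranges))
```

In the toy data the first candidate differs at only one of fifteen framework positions and passes, while the second differs at four of fifteen and does not; changes inside the masked range are ignored in both cases.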

In a preferred embodiment, such antibody library
members comprise both at least one VH chain and at least one
VL chain, at least one of which is a mutant chain, and which
chains may be derived from the same or different antibodies.
The VH and VL chains may be covalently joined by a suitable
linker moiety, as in a "single chain antibody", or they may
be noncovalently joined, as in a naturally occurring
variable domain.

If the joining is noncovalent, and the library is
displayed on cells or virus, then either the VH or the VL
chain may be fused to the carrier surface/coat protein. The
complementary chain may be co-expressed, or added
exogenously to the library.

The members may further comprise some or all of an 
antibody constant heavy and/or constant light chain, or a 
mutant thereof.

Peptoid Library 

A peptoid is an analogue of a peptide in which one or 
more of the peptide bonds (-NH-CO-) are replaced by 
pseudopeptide bonds, which may be the same or different. It 
is not necessary that all of the peptide bonds be replaced,
i.e., a peptoid may include one or more conventional amino 
acid residues, e.g., proline. 

A peptide bond has two small divalent linker elements, 
-NH- and -CO-. Thus, a preferred class of pseudopeptide
bonds is those which consist of two small divalent linker
elements. Each may be chosen independently from the group
consisting of amine (-NH-), substituted amine (-NR-),
carbonyl (-CO-), thiocarbonyl (-CS-), methylene (-CH2-),
monosubstituted methylene (-CHR-), disubstituted methylene
(-CR1R2-), ether (-O-) and thioether (-S-). The more
preferred pseudopeptide bonds include:

N-modified          -NRCO-
Carba               -CH2-CH2-
Depsi               -CO-O-
Hydroxyethylene     -CHOH-CH2-
Ketomethylene       -CO-CH2-
Methylene-Oxy       -CH2-O-
Reduced             -CH2-NH-
Thiomethylene       -CH2-S-
Thiopeptide         -CS-NH-
Retro-Inverso       -CO-NH-

A single peptoid molecule may include more than one
kind of pseudopeptide bond.

For the purposes of introducing diversity into a
peptoid library, one may vary (1) the side chains attached
to the core main chain atoms of the monomers linked by the
pseudopeptide bonds, and/or (2) the side chains (e.g., the
-R of an -NRCO-) of the pseudopeptide bonds. Thus, in one
embodiment, the monomeric units which are not amino acid
residues are of the structure -NR1-CR2-CO-, where at least
one of R1 and R2 is not hydrogen. If there is variability
in the pseudopeptide bond, this is most conveniently achieved
by using an -NRCO- or other pseudopeptide bond with an R
group, and varying the R group. In this event, the R group will
usually be any of the side chains characterizing the amino
acids of peptides, as previously discussed.

If the R group of the pseudopeptide bond is not 
variable, it will usually be small, e.g., not more than 10 
atoms (e.g., hydroxyl, amino, carboxyl, methyl, ethyl,
propyl).

If the conjugation chemistries are compatible, a simple
combinatorial library may include both peptides and
peptoids.

Peptide Nucleic Acid Library 
A PNA oligomer is here defined as one comprising a
plurality of units, at least one of which is a PNA monomer
which comprises a side chain comprising a nucleobase. For
nucleobases, see USP 6,077,835.

The classic PNA oligomer is composed of
(2-aminoethyl)glycine units, with nucleobases attached by
methylene carbonyl linkers. That is, it has the structure

H-(-HN-CH2-CH2-N(-CO-CH2-B)-CH2-CO-)n-OH

where the outer parenthesized substructure is the PNA
monomer.

In this structure, the nucleobase B is separated from
the backbone N by three bonds, and the points of attachment
of the side chains are separated by six bonds. The
nucleobase may be any of the bases included in the
nucleotides discussed in connection with oligonucleotide
libraries. The bases of nucleotides A, G, T, C and U are
preferred.

A PNA oligomer may further comprise one or more amino 
acid residues, especially glycine and proline. 

One can readily envision related molecules in which (1)
the -COCH2- linker is replaced by another linker, especially
one composed of two small divalent linkers as defined
previously; (2) a side chain is attached to one of the three
main chain carbons not participating in the peptide bond
(either instead of or in addition to the side chain attached
to the N of the classic PNA); and/or (3) the peptide bonds are
replaced by pseudopeptide bonds as disclosed previously in
the context of peptoids.

PNA oligomer libraries have been made; see, e.g., Cook,
6,204,326.

Small Organic Compound Library

The small organic compound library ("compound library", 
for short) is a combinatorial library whose members are 
suitable for use as drugs if, indeed, they have the ability 
to mediate a biological activity of the target protein. 

Peptides have certain disadvantages as drugs. These
include susceptibility to degradation by serum proteases,
and difficulty in penetrating cell membranes. Preferably,
all or most of the compounds of the compound library avoid,
or at least do not suffer to the same degree from, one or
more of the pharmaceutical disadvantages of peptides.

In designing a compound library, it is helpful to bear 
in mind the methods of molecular modification typically used 
to obtain new drugs. Three basic kinds of modification may 
be identified: disjunction, in which a lead drug is
simplified to identify its component pharmacophoric
moieties; conjunction, in which two or more known
pharmacophoric moieties, which may be the same or different,
are associated, covalently or noncovalently, to form a new
drug; and alteration, in which one moiety is replaced by
another which may be similar or different, but which is not
in effect a disjunction or conjunction. The use of the
terms "disjunction", "conjunction" and "alteration" is
intended only to connote the structural relationship of the
end product to the original leads, and not how the new drugs
are actually synthesized, although it is possible that the
two are the same.

The process of disjunction is illustrated by the
evolution of neostigmine (1931) and edrophonium (1952) from
physostigmine (1925). Subsequent conjunction is illustrated
by demecarium (1956) and ambenonium (1956).

Alterations may modify the size, polarity, or electron 
distribution of an original moiety. Alterations include 
ring closing or opening, formation of lower or higher
homologues, introduction or saturation of double bonds,
introduction of optically active centers, introduction,
removal or replacement of bulky groups, isosteric or
bioisosteric substitution, changes in the position or
orientation of a group, introduction of alkylating groups,
and introduction, removal or replacement of groups with a
view toward inhibiting or promoting inductive
(electrostatic) or conjugative (resonance) effects.

Thus, the substituents may include electron acceptors
and/or electron donors. Typical electron donors (+I)
include -CH3, -CH2R, -CHR2, -CR3 and -COO-. Typical electron
acceptors (-I) include -NH3+, -NR3+, -NO2, -CN, -COOH, -COOR,
-CHO, -COR, -F, -Cl, -Br, -OH, -OR, -SH, -SR, -CH=CH2,
-CR=CR2, and -C≡CH.

The substituents may also include those which increase
or decrease electronic density in conjugated systems. The
former (+R) groups include -CH3, -CR3, -F, -Cl, -Br, -I, -OH,
-OR, -OCOR, -SH, -SR, -NH2, -NR2, and -NHCOR. The latter (-R)
groups include -NO2, -CN, -CHO, -COR, -COOH, -COOR, -CONH2,
-SO2R and -CF3.

Synthetically speaking, the modifications may be
achieved by a variety of unit processes, including
nucleophilic and electrophilic substitution, reduction and
oxidation, addition, elimination, double bond cleavage, and
cyclization.

For the purpose of constructing a library, a compound,
or a family of compounds, having one or more pharmacological
activities (which need not be related to the known or
suspected activities of the target protein), may be
disjoined into two or more known or potential pharmacophoric
moieties. Analogues of each of these moieties may be
identified, and mixtures of these analogues reacted so as to
reassemble compounds which have some similarity to the
original lead compound. It is not necessary that all
members of the library possess moieties analogous to all of
the moieties of the lead compound.

The design of a library may be illustrated by the
example of the benzodiazepines. Several benzodiazepine
drugs, including chlordiazepoxide, diazepam and oxazepam,
have been used as anti-anxiety drugs. Derivatives of
benzodiazepines have widespread biological activities;
derivatives have been reported to act not only as
anxiolytics, but also as anticonvulsants; cholecystokinin
(CCK) receptor subtype A or B, kappa opioid receptor,
platelet activating factor, and HIV transactivator Tat
antagonists; and GPIIbIIIa, reverse transcriptase and ras
farnesyltransferase inhibitors.

The benzodiazepine structure has been disjoined into a
2-aminobenzophenone, an amino acid, and an alkylating agent.
See Bunin, et al., Proc. Nat. Acad. Sci. USA, 91:4708
(1994). Since only a few 2-aminobenzophenone derivatives
are commercially available, it was later disjoined into a
2-aminoarylstannane, an acid chloride, an amino acid, and an
alkylating agent. Bunin, et al., Meth. Enzymol., 267:448
(1996). The arylstannane may be considered the core
structure upon which the other moieties are substituted, or
all four may be considered equals which are conjoined to
make each library member.

A basic library synthesis plan and member structure is
shown in Figure 1 of Fowlkes, et al., U.S. Serial No.
08/740,671, incorporated by reference in its entirety. The
acid chloride building block introduces variability at the R1
site. The R2 site is introduced by the amino acid, and the
R3 site by the alkylating agent. The R4 site is inherent in
the arylstannane. Bunin, et al. generated a
1,4-benzodiazepine library of 11,200 different derivatives
prepared from 20 acid chlorides, 35 amino acids, and 16
alkylating agents. (No diversity was introduced at R4; this
group was used to couple the molecule to a solid phase.)
According to the Available Chemicals Directory (MDL
Information Systems, San Leandro CA), over 300 acid
chlorides, 80 Fmoc-protected amino acids and 800 alkylating
agents were available for purchase (and more, of course,
could be synthesized). The particular moieties used were
chosen to maximize structural dispersion, while limiting the
numbers to those conveniently synthesized in the wells of a
microtiter plate. In choosing between structurally similar
compounds, preference was given to the least substituted
compound.
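
The library sizes quoted above follow from multiplying the number of building blocks available at each variable site. A minimal sketch of that arithmetic, with a hypothetical helper name, is:

```python
from math import prod


def library_size(*building_block_counts):
    """Number of distinct members when one building block is chosen
    independently at each variable site."""
    return prod(building_block_counts)


# Building blocks actually used: 20 acid chlorides, 35 amino acids,
# 16 alkylating agents.
print(library_size(20, 35, 16))     # 11200

# Lower bound on what the commercially available reagents cited above
# would permit: 300 acid chlorides, 80 amino acids, 800 alkylating agents.
print(library_size(300, 80, 800))   # 19200000
```

The second figure shows why the building blocks had to be selected for structural dispersion rather than used exhaustively: the purchasable reagents alone would give more than 19 million combinations.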

The variable elements included both aliphatic and
aromatic groups. Among the aliphatic groups, both acyclic
and cyclic (mono- or poly-) structures, substituted or not,
were tested. (While all of the acyclic groups were linear,
it would have been feasible to introduce a branched
aliphatic.) The aromatic groups featured either single or
multiple rings, fused or not, substituted or not, and with
heteroatoms or not. The secondary substituents included
-NH2, -OH, -OMe, -CN, -Cl, -F, and -COOH. While not used,
spacer moieties, such as -O-, -S-, -CO-, -CS-, -NH-, and
-NR-, could have been incorporated.

Bunin et al. suggest that instead of using a
1,4-benzodiazepine as a core structure, one may instead use a
1,4-benzodiazepine-2,5-dione structure.

As noted by Bunin et al., it is advantageous, although
not necessary, to use a linkage strategy which leaves no
trace of the linking functionality, as this permits
construction of a more diverse library.

Other combinatorial nonoligomeric compound libraries
known or suggested in the art have been based on carbamates,
mercaptoacylated pyrrolidines, phenolic agents, aminimides,
N-acylamino ethers (made from amino alcohols, aromatic
hydroxy acids, and carboxylic acids), N-alkylamino ethers
(made from aromatic hydroxy acids, amino alcohols and
aldehydes), 1,4-piperazines, and 1,4-piperazine-6-ones.

DeWitt, et al., Proc. Nat. Acad. Sci. (USA), 90:6909-13
(1993) describe the simultaneous, but separate, synthesis of
40 discrete hydantoins and 40 discrete benzodiazepines.
They carry out their synthesis on a solid support (inside a
gas dispersion tube), in an array format, as opposed to
other conventional simultaneous synthesis techniques (e.g.,
in a well, or on a pin). The hydantoins were synthesized by
first simultaneously deprotecting and then treating each of
five amino acid resins with each of eight isocyanates. The
benzodiazepines were synthesized by treating each of five
deprotected amino acid resins with each of eight 2-amino
benzophenone imines.
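
An array synthesis of this kind amounts to a full cross of two reagent sets, with each product identified by its position in the array. The sketch below uses placeholder reagent names (the actual resins and isocyanates are not listed here) and a hypothetical helper function.

```python
from itertools import product

# Placeholder names; the specific reagents are not given in the text.
amino_acid_resins = [f"resin_{i}" for i in range(1, 6)]    # 5 rows
isocyanates = [f"isocyanate_{j}" for j in range(1, 9)]     # 8 columns


def build_array(rows, columns):
    """Map each (row, column) coordinate to the pair of building
    blocks combined at that position."""
    return {
        (r, c): (row, col)
        for (r, row), (c, col) in product(enumerate(rows), enumerate(columns))
    }


hydantoin_array = build_array(amino_acid_resins, isocyanates)
print(len(hydantoin_array))       # 5 * 8 = 40
print(hydantoin_array[(2, 5)])    # ('resin_3', 'isocyanate_6')
```

Because the coordinate alone determines which two building blocks were combined, no separate analysis is needed to identify a product found to be active.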

Chen, et al., J. Am. Chem. Soc., 116:2661-62 (1994)
described the preparation of a pilot (9 member)
combinatorial library of formate esters. A polymer
bead-bound aldehyde preparation was "split" into three aliquots,
each reacted with one of three different ylide reagents.
The reaction products were combined, and then divided into
three new aliquots, each of which was reacted with a
different Michael donor. Compound identity was found to be
determinable on a single bead basis by gas
chromatography/mass spectroscopy analysis.
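
The split-and-pool procedure just described can be simulated to show why each bead ends up carrying a single compound while the pool as a whole covers every combination. This is an illustrative sketch only: the bead count, the reagent labels, and the use of a seeded random shuffle to model mixing are all assumptions.

```python
import random


def split_and_pool(n_beads, reagent_rounds, seed=0):
    """Simulate split-and-pool synthesis.

    In each round the beads are mixed, split into one aliquot per
    reagent, reacted, and pooled again. Returns, for every bead, the
    tuple of reagents it has seen.
    """
    rng = random.Random(seed)
    beads = [[] for _ in range(n_beads)]
    for reagents in reagent_rounds:
        rng.shuffle(beads)                                   # mix the pool
        k = len(reagents)
        aliquots = [beads[i::k] for i in range(k)]           # split
        for reagent, aliquot in zip(reagents, aliquots):
            for bead in aliquot:
                bead.append(reagent)                         # react
        beads = [bead for aliquot in aliquots for bead in aliquot]  # pool
    return [tuple(bead) for bead in beads]


ylides = ["ylide_1", "ylide_2", "ylide_3"]
donors = ["donor_1", "donor_2", "donor_3"]
beads = split_and_pool(900, [ylides, donors])
compounds = set(beads)
print(len(beads), len(compounds))
```

With several hundred beads, all nine ylide/donor combinations are expected to be represented, and each bead's history contains exactly one ylide and one donor, which is what makes single-bead analysis meaningful.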

Holmes, USP 5,549,974 (1996) sets forth methodologies
for the combinatorial synthesis of libraries of
thiazolidinones and metathiazanones. These libraries are
made by combination of amines, carbonyl compounds, and
thiols under cyclization conditions.

Ellman, USP 5,545,568 (1996) describes combinatorial
synthesis of benzodiazepines, prostaglandins, beta-turn
mimetics, and glycerol-based compounds. See also Ellman,
USP 5,288,514.

Summerton, USP 5,506,337 (1996) discloses methods of
preparing a combinatorial library formed predominantly of
morpholino subunit structures.

Heterocyclic combinatorial libraries are reviewed
generally in Nefzi, et al., Chem. Rev., 97:449-472 (1997).

For pharmacological classes, see, e.g., Goth, Medical
Pharmacology: Principles and Concepts (C.V. Mosby Co.: 8th
ed. 1976); Korolkovas and Burckhalter, Essentials of
Medicinal Chemistry (John Wiley & Sons, Inc.: 1976). For
synthetic methods, see, e.g., Warren, Organic Synthesis: The
Disconnection Approach (John Wiley & Sons, Ltd.: 1982);
Fuson, Reactions of Organic Compounds (John Wiley & Sons:
1966); Payne and Payne, How to do an Organic Synthesis
(Allyn and Bacon, Inc.: 1969); Greene, Protective Groups in
Organic Synthesis (Wiley-Interscience). For selection of
substituents, see, e.g., Hansch and Leo, Substituent
Constants for Correlation Analysis in Chemistry and Biology
(John Wiley & Sons: 1979).

The library is preferably synthesized so that the
individual members remain identifiable so that, if a member
is shown to be active, it is not necessary to analyze it.
Several methods of identification have been proposed,
including:

(1) encoding, i.e., the attachment to each member of
an identifier moiety which is more readily
identified than the member proper. This has the
disadvantage that the tag may itself influence the
activity of the conjugate.

(2) spatial addressing, e.g., each member is
synthesized only at a particular coordinate on or
in a matrix, or in a particular chamber. This
might be, for example, the location of a
particular pin, or a particular well on a
microtiter plate, or inside a "tea bag".

The present invention is not limited to any particular form
of identification.

However, it is possible to simply characterize those 
members of the library which are found to be active, based 
on the characteristic spectroscopic indicia of the various
building blocks.

Solid phase synthesis permits greater control over 
which derivatives are formed. However, the solid phase 
could interfere with activity. To overcome this problem,
some or all of the molecules of each member could be
liberated, after synthesis but before screening.

Examples of candidate simple libraries which might be 
evaluated include derivatives of the following: 
Cyclic Compounds Containing One Hetero Atom
  Heteronitrogen
    pyrroles
    pentasubstituted pyrroles
    pyrrolidines
    pyrrolines
    prolines
    indoles
    beta-carbolines
    pyridines
    dihydropyridines
    1,4-dihydropyridines
    pyrido[2,3-d]pyrimidines
    tetrahydro-3H-imidazo[4,5-c]pyridines
    isoquinolines
    tetrahydroisoquinolines
    quinolones
    beta-lactams
    azabicyclo[4.3.0]nonen-8-one amino acid
  Heterooxygen
    furans
    tetrahydrofurans
    2,5-disubstituted tetrahydrofurans
    pyrans
    hydroxypyranones
    tetrahydroxypyranones
    gamma-butyrolactones
  Heterosulfur
    sulfolenes
Cyclic Compounds with Two or More Hetero Atoms
  Multiple heteronitrogens
    imidazoles
    pyrazoles
    piperazines
    diketopiperazines
    arylpiperazines
    benzylpiperazines
    benzodiazepines
    1,4-benzodiazepine-2,5-diones
    hydantoins
    5-alkoxyhydantoins
    dihydropyrimidines
    1,3-disubstituted-5,6-dihydropyrimidine-2,4-diones
    cyclic ureas
    cyclic thioureas
    quinazolines
    chiral 3-substituted-quinazoline-2,4-diones
    triazoles
    1,2,3-triazoles
    purines
  Heteronitrogen and Heterooxygen
    diketomorpholines
    isoxazoles
    isoxazolines
  Heteronitrogen and Heterosulfur
    thiazolidines
    N-acyl thiazolidines
    dihydrothiazoles
    2-methylene-2,3-dihydrothiazoles
    2-aminothiazoles
    thiophenes
    3-aminothiophenes
    4-thiazolidinones
    4-metathiazanones
    benzisothiazolones
For details on synthesis of libraries, see Nefzi, et
al., Chem. Rev., 97:449-72 (1997), and references cited
therein.

The preferred animal subject of the present invention
is a mammal. By the term "mammal" is meant an individual
belonging to the class Mammalia. The invention is
particularly useful in the treatment of human subjects,
although it is intended for veterinary and nutritional uses
as well. Preferred nonhuman subjects are of the orders
Primates (e.g., apes and monkeys), Artiodactyla or
Perissodactyla (e.g., cows, pigs, sheep, horses, goats),
Carnivora (e.g., cats, dogs), Rodentia (e.g., rats, mice,
guinea pigs, hamsters), Lagomorpha (e.g., rabbits) or other
pet, farm or laboratory mammals.

The term "protection", as used herein, is intended to
include "prevention," "suppression" and "treatment."
"Prevention", strictly speaking, involves administration of
the pharmaceutical prior to the induction of the disease (or
other adverse clinical condition). "Suppression" involves
administration of the composition prior to the clinical
appearance of the disease. "Treatment" involves
administration of the protective composition after the
appearance of the disease.

It will be understood that in human and veterinary 
medicine, it is not always possible to distinguish between 
"preventing" and "suppressing" since the ultimate inductive 
event or events may be unknown or latent, or the patient may
not be ascertained until well after the occurrence of the event
or events. Therefore, unless qualified, the term 
"prevention" will be understood to refer to both prevention 
in the strict sense, and to suppression. 

The preventative or prophylactic use of a
pharmaceutical usually involves identifying subjects who are
at higher risk than the general population of contracting
the disease, and administering the pharmaceutical to them in
advance of the clinical appearance of the disease. The
effectiveness of such use is measured by comparing the
subsequent incidence or severity of the disease, or of
particular symptoms of the disease, in the treated subjects
against that in untreated subjects of the same high risk
group.




While high risk factors vary from disease to disease,
in general, these include (1) prior occurrence of the
disease in one or more members of the same family, or, in
the case of a contagious disease, in individuals with whom
the subject has come into potentially contagious contact at
a time when the earlier victim was likely to be contagious;
(2) a prior occurrence of the disease in the subject; (3)
prior occurrence of a related disease, or a condition known
to increase the likelihood of the disease, in the subject;
(4) appearance of a suspicious level of a marker of the
disease, or a related disease or condition; (5) a subject
who is immunologically compromised, e.g., by radiation
treatment, HIV infection, drug use, etc.; or (6) membership
in a particular group (e.g., a particular age, sex, race,
ethnic group, etc.) which has been epidemiologically
associated with that disease.

In some cases, it may be desirable to provide 
prophylaxis for the general population, and not just a high 
risk group. This is most likely to be the case when
essentially all are at risk of contracting the disease, the
effects of the disease are serious, the therapeutic index of
the prophylactic agent is high, and the cost of the agent is
low.

A prophylaxis or treatment may be curative, that is,
directed at the underlying cause of a disease, or
ameliorative, that is, directed at the symptoms of the
disease, especially those which reduce the quality of life.

It should also be understood that to be useful, the
protection provided need not be absolute, provided that it
is sufficient to carry clinical value. An agent which
provides protection to a lesser degree than do competitive
agents may still be of value if the other agents are
ineffective for a particular individual, if it can be used
in combination with other agents to enhance the level of
protection, or if it is safer than competitive agents. It is
desirable that there be a statistically significant (p=0.05
or less) improvement in the treated subject relative to an
appropriate untreated control, and it is desirable that this
improvement be at least 10%, more preferably at least 25%,
still more preferably at least 50%, even more preferably at
least 100%, in some indicia of the incidence or severity of
the disease or of at least one symptom of the disease.

At least one of the drugs of the present invention may 
be administered, by any means that achieve their intended 
purpose, to protect a subject against a disease or other 
adverse condition. The form of administration may be 
systemic or topical. For example, administration of such a 
composition may be by various parenteral routes such as 
subcutaneous, intravenous, intradermal, intramuscular, 
intraperitoneal, intranasal, transdermal, or buccal routes. 
Alternatively, or concurrently, administration may be by the 
oral route. Parenteral administration can be by bolus 
injection or by gradual perfusion over time. 

A typical regimen comprises administration of an 
effective amount of the drug, administered over a period 
ranging from a single dose, to dosing over a period of 
hours, days, weeks, months, or years. 

It is understood that the suitable dosage of a drug of 
the present invention will be dependent upon the age, sex, 
health, and weight of the recipient, kind of concurrent 
treatment, if any, frequency of treatment, and the nature of 
the effect desired. However, the most preferred dosage can 
be tailored to the individual subject, as is understood and 
determinable by one of skill in the art, without undue 
experimentation. This will typically involve adjustment of 
a standard dose, e.g., reduction of the dose if the patient 
has a low body weight.

Prior to use in humans, a drug will first be evaluated 
for safety and efficacy in laboratory animals. In human 
clinical studies, one would begin with a dose expected to be 
safe in humans, based on the preclinical data for the drug 
in question, and on customary doses for analogous drugs (if 
any). If this dose is effective, the dosage may be 
decreased, to determine the minimum effective dose, if 
desired. If this dose is ineffective, it will be cautiously 
increased, with the patients monitored for signs of side 
effects. See, e.g., Berkow et al., eds., The Merck Manual, 
15th edition, Merck and Co., Rahway, N.J., 1987; Goodman et 
al., eds., Goodman and Gilman's The Pharmacological Basis of 
Therapeutics, 8th edition, Pergamon Press, Inc., Elmsford, 
N.Y., (1990); Avery's Drug Treatment: Principles and 
Practice of Clinical Pharmacology and Therapeutics, 3rd 
edition, ADIS Press, LTD., Williams and Wilkins, Baltimore, 
MD. (1987); Ebadi, Pharmacology, Little, Brown and Co., 
Boston, (1985), which references and references cited 
therein, are entirely incorporated herein by reference.
The total dose required for each treatment may be 
administered by multiple doses or in a single dose. The 
protein may be administered alone or in conjunction with 
other therapeutics directed to the disease or directed to 
other symptoms thereof.

Typical pharmaceutical doses, for adult humans, are in 
the range of 1 ng to 10 g per day, more often 1 mg to 1 g per 
day.

The appropriate dosage form will depend on the disease, 
the pharmaceutical, and the mode of administration; 
possibilities include tablets, capsules, lozenges, dental 
pastes, suppositories, inhalants, solutions, ointments and 
parenteral depots. See, e.g., Berkow, supra, Goodman, 
supra, Avery, supra and Ebadi, supra, which are entirely 
incorporated herein by reference, including all references 
cited therein.

In the case of peptide drugs, the drug may be 
administered in the form of an expression vector comprising 
a nucleic acid encoding the peptide; such a vector, after 
incorporation into the genetic complement of a cell of the 
patient, directs synthesis of the peptide. Suitable vectors 
include genetically engineered poxviruses (vaccinia), 
adenoviruses, adeno-associated viruses, herpesviruses and 
lentiviruses which are or have been rendered nonpathogenic.

In addition to at least one drug as described herein, a 
pharmaceutical composition may contain suitable 
pharmaceutically acceptable carriers, such as excipients, 
carriers and/or auxiliaries which facilitate processing of 
the active compounds into preparations which can be used 
pharmaceutically. See, e.g., Berkow, supra, Goodman, supra, 
Avery, supra and Ebadi, supra, which are entirely 
incorporated herein by reference, including all references 
cited therein.

Assay Compositions and Methods 

Target Organism

The invention contemplates that it may be appropriate 
to ascertain or to mediate the biological activity of a 
substance of this invention in a target organism. 

The target organism may be a plant, animal, or 
microorganism.

In the case of a plant, it may be an economic plant, in 
which case the drug may be intended to increase the disease, 
weather or pest resistance, alter the growth 
characteristics, or otherwise improve the useful 
characteristics or mute undesirable characteristics of the 
plant. Or it may be a weed, in which case the drug may be 
intended to kill or otherwise inhibit the growth of the 
plant, or to alter its characteristics to convert it from a 
weed to an economic plant. The plant may be a tree, shrub, 
crop, grass, etc. The plant may be an alga (algae are in 
some cases also microorganisms), or a vascular plant, 
especially gymnosperms (particularly conifers) and 
angiosperms. Angiosperms may be monocots or dicots. The 
plants of greatest interest are rice, wheat, corn, alfalfa, 
soybeans, potatoes, peanuts, tomatoes, melons, apples, 
pears, plums, pineapples, fir, spruce, pine, cedar, and oak.

If the target organism is a microorganism, it may be 
algae, bacteria, fungi, or a virus (although the biological 
activity of a virus must be determined in a virus-infected 
cell). The microorganism may be a human or other animal or 
plant pathogen, or it may be nonpathogenic. It may be a 
soil or water organism, or one which normally lives inside 
other living things.

If the target organism is an animal, it may be a 
vertebrate or a nonvertebrate animal. Nonvertebrate animals 
are chiefly of interest when they act as pathogens or 
parasites, and the drugs are intended to act as biocidal or 
biostatic agents. Nonvertebrate animals of interest include 
worms, mollusks, and arthropods.

The target organism may also be a vertebrate animal, 
i.e., a mammal, bird, reptile, fish or amphibian. Among 
mammals, the target animal preferably belongs to the order 
Primates (humans, apes and monkeys), Artiodactyla (e.g., 
cows, pigs, sheep, goats, horses), Rodentia (e.g., mice, 
rats), Lagomorpha (e.g., rabbits, hares), or Carnivora (e.g., 
cats, dogs). Among birds, the target animals are preferably 
of the orders Anseriformes (e.g., ducks, geese, swans) or 
Galliformes (e.g., quails, grouse, pheasants, turkeys and 
chickens). Among fish, the target animal is preferably of 
the order Clupeiformes (e.g., sardines, shad, anchovies, 
whitefish, salmon).

Target Tissues 

The term "target tissue" refers to any whole animal, 
physiological system, whole organ, part of organ, 
miscellaneous tissue, cell, or cell component (e.g., the 
cell membrane) of a target animal in which biological 
activity may be measured.

Routinely in mammals one would choose to compare and 
contrast the biological impact on virtually any and all 
tissues which express the subject receptor protein. The 
main tissues to use are: brain, heart, lung, kidney, liver, 
pancreas, skin, intestines, adipose, stomach, skeletal 
muscle, adrenal glands, breast, prostate, vasculature, 
retina, cornea, thyroid gland, parathyroid glands, thymus, 
bone marrow, bone, etc.

Another classification would be by cell type: B cells, 
T cells, macrophages, neutrophils, eosinophils, mast cells, 
platelets, megakaryocytes, erythrocytes, bone marrow stromal 
cells, fibroblasts, neurons, astrocytes, neuroglia, 
microglia, epithelial cells (from any organ, e.g., skin, 
breast, prostate, lung, intestines, etc.), cardiac muscle 
cells, smooth muscle cells, striated muscle cells, 
osteoblasts, osteocytes, chondroblasts, chondrocytes, 
keratinocytes, melanocytes, etc.

Of course, in the case of a unicellular organism, there 
is no distinction between the "target organism" and the 
"target tissue".

Screening Assays

Assays intended to determine the binding or the 
biological activity of a substance are called preliminary 
screening assays.

Screening assays will typically be either in vitro 
(cell-free) assays (for binding to an immobilized receptor) 
or cell-based assays (for alterations in the phenotype of 
the cell). They will not involve screening of whole 
multicellular organisms, or isolated organs. The comments 
on diagnostic biological assays apply mutatis mutandis to 
cell-based screening assays.

In Vitro vs. In Vivo Assays

The term in vivo is descriptive of an event, such as 
binding or enzymatic action, which occurs within a living 
organism. The organism in question may, however, be 
genetically modified. The term in vitro refers to an event 
which occurs outside a living organism. Parts of an 
organism (e.g., a membrane, or an isolated biochemical) are 
used, together with artificial substrates and/or conditions. 
For the purpose of the present invention, the term in vitro 
excludes events occurring inside or on an intact cell, 
whether of a unicellular or multicellular organism.

In vivo assays include both cell-based assays and 
organismic assays. The cell-based assays include both assays 
on unicellular organisms, and assays on isolated cells or 
cell cultures derived from multicellular organisms. The 
cell cultures may be mixed, provided that they are not 
organized into tissues or organs. The term organismic assay 
refers to assays on whole multicellular organisms, and 
assays on isolated organs or tissues of such organisms.
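
The definitions above amount to a three-way classification by test system, which may be sketched as follows. The enum, the function `classify_assay`, and the test-system strings are illustrative assumptions chosen to mirror the wording of this section; they are not terms defined by the specification.

```python
from enum import Enum


class AssayClass(Enum):
    IN_VITRO = "in vitro (cell-free)"
    CELL_BASED = "in vivo, cell-based"
    ORGANISMIC = "in vivo, organismic"


# Hypothetical test-system labels, grouped as the text groups them.
_CELL_FREE = {"membrane", "isolated biochemical", "immobilized receptor"}
_CELL_BASED = {"unicellular organism", "isolated cells", "cell culture"}
_ORGANISMIC = {"whole multicellular organism", "isolated organ", "isolated tissue"}


def classify_assay(test_system: str) -> AssayClass:
    """Return the assay class for a test system, per the definitions above."""
    key = test_system.strip().lower()
    if key in _CELL_FREE:
        return AssayClass.IN_VITRO
    if key in _CELL_BASED:
        # Intact cells count as in vivo here, even when grown in culture.
        return AssayClass.CELL_BASED
    if key in _ORGANISMIC:
        return AssayClass.ORGANISMIC
    raise ValueError(f"unrecognized test system: {test_system!r}")
```

Note that, under the definition used here, an assay on a cell culture is classed as in vivo because the event occurs inside or on an intact cell.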

In vitro Diagnostic Methods and Reagents 

The in vitro assays of the present invention may be 
applied to any suitable analyte-containing sample, and may 
be qualitative or quantitative in nature. 



Sample 

The sample will normally be a biological fluid, such as 
blood, urine, lymph, semen, milk, or cerebrospinal fluid, or 
a fraction or derivative thereof, or a biological tissue, in 
the form of, e.g., a tissue section or homogenate. However, 
the sample conceivably could be (or derived from) a food or 
beverage, a pharmaceutical or diagnostic composition, soil, 
or surface or ground water. If a biological fluid or 
tissue, it may be taken from a human or other mammal, 
vertebrate or animal, or from a plant. The preferred sample 
is blood, or a fraction or derivative thereof.

Binding and Reaction Assays 

The assay may be a binding assay, in which one step 
involves the binding of a diagnostic reagent to the analyte, 
or a reaction assay, which involves the reaction of a 
reagent with the analyte. The reagents used in a binding 
assay may be classified as to the nature of their 
interaction with analyte: (1) analyte analogues, or (2) 
analyte binding molecules (ABM). They may be labeled or 
insolubilized.

In a reaction assay, the assay may look for a direct 
reaction between the analyte and a reagent which is reactive 
with the analyte, or if the analyte is an enzyme or enzyme 
inhibitor, for a reaction catalyzed or inhibited by the 
analyte. The reagent may be a reactant, a catalyst, or an 
inhibitor for the reaction.

An assay may involve a cascade of steps in which the 
product of one step acts as the target for the next step. 
These steps may be binding steps, reaction steps, or a 
combination thereof.

Signal Producing System (SPS) 

In order to detect the presence, or measure the amount, 
of an analyte, the assay must provide for a signal producing 
system (SPS) in which there is a detectable difference in 
the signal produced, depending on whether the analyte is 
present or absent (or, in a quantitative assay, on the 
amount of the analyte). The detectable signal may be one 
which is visually detectable, or one detectable only with 
instruments. Possible signals include production of colored 
or luminescent products, alteration of the characteristics 
(including amplitude or polarization) of absorption or 
emission of radiation by an assay component or product, and 
precipitation or agglutination of a component or product. 
The term "signal" is intended to include the discontinuance 
of an existing signal, or a change in the rate of change of 
an observable parameter, rather than a change in its 
absolute value. The signal may be monitored manually or 
automatically.

In a reaction assay, the signal is often a product of 
the reaction. In a binding assay, it is normally provided 
by a label borne by a labeled reagent.

Labels 

The component of the signal producing system which is 
most intimately associated with the diagnostic reagent is 
called the "label". A label may be, e.g., a radioisotope, a 
fluorophore, an enzyme, a co-enzyme, an enzyme substrate, an 
electron-dense compound, or an agglutinable particle.

The radioactive isotope can be detected by such means 
as the use of a gamma counter or a scintillation counter or 
by autoradiography. Isotopes which are particularly useful 
for the purpose of the present invention include 3H, 125I, 
131I, 35S, 14C, 32P and 33P. 125I is preferred for antibody 
labeling.

The label may also be a fluorophore. When the 
fluorescently labeled reagent is exposed to light of the 
proper wavelength, its presence can then be detected due to 
fluorescence. Among the most commonly used fluorescent 
labeling compounds are fluorescein isothiocyanate, 
rhodamine, phycoerythrin, phycocyanin, allophycocyanin, 
o-phthaldehyde and fluorescamine.

Alternatively, fluorescence-emitting metals such as 
152Eu, or others of the lanthanide series, may be 
incorporated into a diagnostic reagent using such metal 
chelating groups as diethylenetriaminepentaacetic acid 
(DTPA) or ethylenediamine-tetraacetic acid (EDTA).

The label may also be a chemiluminescent compound. The 
presence of the chemiluminescently labeled reagent is then 
determined by detecting the presence of luminescence that 
arises during the course of a chemical reaction. Examples 
of particularly useful chemiluminescent labeling compounds 
are luminol, isoluminol, theromatic acridinium ester, 
imidazole, acridinium salt and oxalate ester.

Likewise, a bioluminescent compound may be used for 
labeling. Bioluminescence is a type of chemiluminescence 
found in biological systems in which a catalytic protein 
increases the efficiency of the chemiluminescent reaction. 
The presence of a bioluminescent protein is determined by 
detecting the presence of luminescence. Important 
bioluminescent compounds for purposes of labeling are 
luciferin, luciferase and aequorin.

Enzyme labels, such as horseradish peroxidase and 
alkaline phosphatase, are preferred. When an enzyme label 
is used, the signal producing system must also include a 
substrate for the enzyme. If the enzymatic reaction product 
is not itself detectable, the SPS will include one or more 
additional reactants so that a detectable product appears.

An enzyme analyte may act as its own label if an enzyme 
inhibitor is used as a diagnostic reagent.

Binding Assay Formats 

Binding assays may be divided into two basic types, 
heterogeneous and homogeneous. In heterogeneous assays, the 
interaction between the affinity molecule and the analyte 
does not affect the label; hence, to determine the amount or 
presence of analyte, bound label must be separated from free 
label. In homogeneous assays, the interaction does affect 
the activity of the label, and therefore analyte levels can 
be deduced without the need for a separation step.

In one embodiment, the ABM is insolubilized by coupling 
it to a macromolecular support, and analyte in the sample is 
allowed to compete with a known quantity of a labeled or 
specifically labelable analyte analogue. The "analyte 
analogue" is a molecule capable of competing with analyte 
for binding to the ABM, and the term is intended to include 
analyte itself. It may be labeled already, or it may be 
labeled subsequently by specifically binding the label to a 
moiety differentiating the analyte analogue from analyte. 
The solid and liquid phases are separated, and the labeled 
analyte analogue in one phase is quantified. The higher the 
level of analyte analogue in the solid phase, i.e., 
sticking to the ABM, the lower the level of analyte in the 
sample.
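
The inverse relationship in this competitive format can be illustrated with a deliberately simplified model. The sketch below assumes that the immobilized ABM sites are limiting and fully occupied, and that analyte and labeled analogue bind with equal affinity, so that the sites are shared in proportion to the amounts present. These assumptions, and the function names, are introduced here for illustration only; a real assay would be read against an empirical standard curve.

```python
def bound_tracer(analyte: float, tracer: float, sites: float) -> float:
    """Amount of labeled analogue (tracer) captured on the solid phase.

    Simplifying assumptions (not from the specification): all ABM sites are
    occupied, and analyte and tracer compete with equal affinity.
    """
    if tracer <= 0 or sites <= 0 or analyte < 0:
        raise ValueError("tracer and sites must be positive, analyte non-negative")
    return sites * tracer / (tracer + analyte)


def analyte_from_bound(bound: float, tracer: float, sites: float) -> float:
    """Invert the model: estimate analyte from the measured bound tracer."""
    if not 0 < bound <= sites:
        raise ValueError("bound tracer must be in (0, sites]")
    return tracer * (sites / bound - 1.0)
```

Under this model, more analyte in the sample leaves less labeled analogue on the solid phase, which is the relationship stated in the text.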

In a "sandwich assay", both an insolubilized ABM and a 
labeled ABM are employed. The analyte is captured by the 
insolubilized ABM and is tagged by the labeled ABM, forming 
a ternary complex. The reagents may be added to the sample 
in either order, or simultaneously. The ABMs may be the 
same or different. The amount of labeled ABM in the ternary 
complex is directly proportional to the amount of analyte in 
the sample.
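
Because the sandwich signal is stated to be directly proportional to analyte, an unknown can be read off a proportional calibration line. The following is a minimal sketch assuming a set of calibrators of known concentration and a least-squares fit constrained through the origin; the function names and calibrator values are hypothetical.

```python
def fit_proportional_slope(calibrators: list[tuple[float, float]]) -> float:
    """Least-squares slope of signal vs. concentration for a line through zero.

    calibrators: (known_concentration, measured_signal) pairs.
    """
    sum_xx = sum(conc * conc for conc, _ in calibrators)
    if sum_xx == 0:
        raise ValueError("need at least one calibrator with nonzero concentration")
    sum_xy = sum(conc * signal for conc, signal in calibrators)
    return sum_xy / sum_xx


def estimate_analyte(signal: float, slope: float) -> float:
    """Convert a measured signal to an analyte amount using the fitted slope."""
    if slope <= 0:
        raise ValueError("slope must be positive for a sandwich assay")
    return signal / slope


# Illustrative calibrators: the signal doubles when the concentration doubles.
slope = fit_proportional_slope([(1.0, 2.0), (2.0, 4.0), (4.0, 8.0)])
unknown = estimate_analyte(6.0, slope)
```

In practice the proportional range is limited, for example by saturation of the capture ABM, so the calibrators should bracket the expected sample values.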

The two embodiments described above are both 
heterogeneous assays. However, homogeneous assays are 
conceivable. The key is that the label be affected by 
whether or not the complex is formed.

Conjugation Methods

A label may be conjugated, directly or indirectly 
(e.g., through a labeled anti-ABM antibody), covalently 
(e.g., with SPDP) or noncovalently, to the ABM, to produce a 
diagnostic reagent. Similarly, the ABM may be conjugated to 
a solid phase support to form a solid phase ("capture") 
diagnostic reagent.

Suitable supports include glass, polystyrene, 
polypropylene, polyethylene, dextran, nylon, amylases, 
natural and modified celluloses, polyacrylamides, agaroses, 
and magnetite. The nature of the carrier can be either 
soluble to some extent or insoluble for the purposes of the 
present invention.

The support material may have virtually any possible 
structural configuration so long as the coupled molecule is 
capable of binding to its target. Thus the support 
configuration may be spherical, as in a bead, or 
cylindrical, as in the inside surface of a test tube, or the 
external surface of a rod. Alternatively, the surface may 
be flat such as a sheet, test strip, etc.

Biological Assays

A biological assay measures or detects a biological 
response of a biological entity to a substance. 

The biological entity may be a whole organism, an 
isolated organ or tissue, freshly isolated cells, an 
immortalized cell line, or a subcellular component (such as 
a membrane; this term should not be construed as including 
an isolated receptor). The entity may be, or may be derived 
from, an organism which occurs in nature, or which is 
modified in some way. Modifications may be genetic 
(including radiation and chemical mutants, and genetic 
engineering) or somatic (e.g., surgical, chemical, etc.). 
In the case of a multicellular entity, the modifications may 
affect some or all cells. The entity need not be the target 
organism, or a derivative thereof, if there is a reasonable 
correlation between bioassay activity in the assay entity 
and biological activity in the target organism.

The entity is placed in a particular environment, which 
may be more or less natural. For example, a culture medium 
may, but need not, contain serum or serum substitutes, and 
it may, but need not, include a support matrix of some kind; 
it may be still or agitated. It may contain particular 
biological or chemical agents, or have particular physical 
parameters (e.g., temperature), that are intended to nourish 
or challenge the biological entity.

There must also be a detectable biological marker for 
the response. At the cellular level, the most common 
markers are cell survival and proliferation, cell behavior 
(clustering, motility), cell morphology (shape, color), and 
biochemical activity (overall DNA synthesis, overall protein 
synthesis, and specific metabolic activities, such as 
utilization of particular nutrients, e.g., consumption of 
oxygen, production of CO2, production of organic acids, 
uptake or discharge of ions).

The direct signal produced by the biological marker may 
be transformed by a signal producing system into a different 
signal which is more observable, for example, a fluorescent 
or colorimetric signal.

The entity, environment, marker and signal producing 
system are chosen to achieve a clinically acceptable level 
of sensitivity, specificity and accuracy.

In some cases, the goal will be to identify substances 
which mediate the biological activity of a natural 
biological entity, and the assay is carried out directly 
with that entity. In other cases, the biological entity is 
used simply as a model of some more complex (or otherwise 
inconvenient to work with) biological entity. In that 
event, the model biological entity is used because activity 
in the model system is considered more predictive of 
activity in the ultimate natural biological entity than is 
simple binding activity in an in vitro system. The model 
entity is used instead of the ultimate entity because the 
latter is more expensive or slower to work with, or because 
ethical considerations forbid working with the ultimate 
entity yet.

The model entity may be naturally occurring, if the 
model entity usefully models the ultimate entity under some 
conditions. Or it may be non-naturally occurring, with 
modifications that increase its resemblance to the ultimate 
entity.

Transgenic animals, such as transgenic mice, rats, and 
rabbits, have been found useful as model systems. 

In cell-based model assays, where the biological 
activity is mediated by binding to a receptor (target 
protein), the receptor may be functionally connected to a 
signal (biological marker) producing system, which may be 
endogenous or exogenous to the cell. 
There are a number of techniques for doing this.

"Zero-Hybrid" Systems

In these systems, the binding of a peptide to the 
target protein results in a screenable or selectable 
phenotypic change, without resort to fusing the target 
protein (or a ligand binding moiety thereof) to an 
endogenous protein. It may be that the target protein is 
endogenous to the host cell, or is substantially identical 
to an endogenous receptor so that it can take advantage of 
the latter's native signal transduction pathway. Or 
sufficient elements of the signal transduction pathway 
normally associated with the target protein may be 
engineered into the cell so that the cell signals binding to 
the target protein.

"One-Hybrid" Systems

In these systems, a chimeric receptor, a hybrid of the 
target protein and an endogenous receptor, is used. The 
chimeric receptor has the ligand binding characteristics of 
the target protein and the signal transduction 
characteristics of the endogenous receptor. Thus, the 
normal signal transduction pathway of the endogenous 
receptor is subverted.

Preferably, the endogenous receptor is inactivated, or 
the conditions of the assay avoid activation of the 
endogenous receptor, to improve the signal-to-noise ratio.

See Fowlkes USP 5,789,184 for a yeast system. 

Another type of "one-hybrid" system combines a 
peptide:DNA-binding domain fusion with an unfused target 
receptor that possesses an activation domain.

"Two-Hybrid" System 

In a preferred embodiment, the cell-based assay is a 
two-hybrid system. This term implies that the ligand is 
incorporated into a first hybrid protein, and the receptor 
into a second hybrid protein. The first hybrid also 
comprises component A of a signal generating system, and the 
second hybrid comprises component B of that system. 
Components A and B, by themselves, are insufficient to 
generate a signal. However, if the ligand binds the 
receptor, components A and B are brought into sufficiently 
close proximity so that they can cooperate to generate a 
signal.

Components A and B may naturally occur, or be 
substantially identical to moieties which naturally occur, 
as components of a single naturally occurring biomolecule, 
or they may naturally occur, or be substantially identical 
to moieties which naturally occur, as separate naturally 
occurring biomolecules which interact in nature.
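
The logic of the two-hybrid arrangement can be summarized in a short sketch: a signal arises only when both signal components are present and the ligand and receptor moieties bind. The `Hybrid` class, the `binds` callback, and the moiety names are hypothetical stand-ins for the biology and are not part of the specification.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Hybrid:
    binding_moiety: str    # the ligand or the receptor carried by this hybrid
    signal_component: str  # "A" or "B" of the signal generating system


def signal_generated(first: Hybrid, second: Hybrid,
                     binds: Callable[[str, str], bool]) -> bool:
    """True only if components A and B are both present AND the moieties bind.

    Neither component alone generates a signal; binding of ligand to receptor
    is what brings A and B close enough to cooperate.
    """
    components = {first.signal_component, second.signal_component}
    if components != {"A", "B"}:
        return False
    return binds(first.binding_moiety, second.binding_moiety)


# Illustrative binding rule: only this one ligand/receptor pair interacts.
def example_binds(ligand: str, receptor: str) -> bool:
    return (ligand, receptor) == ("peptide_X", "receptor_Y")
```

A usage example: `signal_generated(Hybrid("peptide_X", "A"), Hybrid("receptor_Y", "B"), example_binds)` evaluates to `True`, whereas substituting a non-binding peptide, or giving both hybrids component A, evaluates to `False`.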

Two-Hybrid System: Transcription Factor Type 

In a preferred "two-hybrid" embodiment, one member of a 
peptide ligand:receptor binding pair is expressed as a 
fusion to a DNA-binding domain (DBD) from a transcription 
factor (this fusion protein is called the "bait"), and the 
other is expressed as a fusion to a transactivation domain 
(TAD) (this fusion protein is called the "fish", the "prey", 
or the "catch"). The transactivation domain should be 
complementary to the DNA-binding domain, i.e., it should 
interact with the latter so as to activate transcription of 
a specially designed reporter gene that carries a binding 
site for the DNA-binding domain. Naturally, the two fusion 
proteins must likewise be complementary.

This complementarity may be achieved by use of the 
complementary and separable DNA-binding and transcriptional 
activator domains of a single transcriptional activator 
protein, or one may use complementary domains derived from 
different proteins. The domains may be identical to the 
native domains, or mutants thereof. The assay members may 
be fused directly to the DBD or TAD, or fused through an 
intermediate linker.

The target DNA operator may be the native operator 
sequence, or a mutant operator. Mutations in the operator 
may be coordinated with mutations in the DBD and the TAD. 
An example of a suitable transcription activation system is 
one comprising the DNA-binding domain from the bacterial 
repressor LexA and the activation domain from the yeast 
transcription factor Gal4, with the reporter gene operably 
linked to the LexA operator.

It is not necessary to employ the intact target 
receptor; just the ligand-binding moiety is sufficient.

The two fusion proteins may be expressed from the same 
or different vectors. Likewise, the activatable reporter 
gene may be expressed from the same vector as either fusion 
protein (or both proteins), or from a third vector.

Potential DNA-binding domains include Gal4, LexA, and 
mutant domains substantially identical to the above.

Potential activation domains include E. coli B42, Gal4 
activation domain II, and HSV VP16, and mutant domains 
substantially identical to the above.

Potential operators include the native operators for 
the desired activation domain, and mutant operators 
substantially identical to the native operator.

The fusion proteins may comprise nuclear localization 
signals.

The assay system will include a signal producing 
system, too. The first element of this system is a reporter 
gene operably linked to an operator responsive to the DBD 
and TAD of choice. The expression of this reporter gene 
will result, directly or indirectly, in a selectable or 
screenable phenotype (the signal). The signal producing 
system may include, besides the reporter gene, additional 
genetic or biochemical elements which cooperate in the 
production of the signal. Such an element could be, for 
example, a selective agent in the cell growth medium. There 
may be more than one signal producing system, and the system 
may include more than one reporter gene.

The sensitivity of the system may be adjusted by, e.g., 
use of competitive inhibitors of any step in the activation 
or signal production process, increasing or decreasing the 
number of operators, using a stronger or weaker DBD or TAD, 
etc.

When the signal is the death or survival of the cell in 
question, or proliferation or nonproliferation of the cell 
in question, the assay is said to be a selection. When the 
signal merely results in a detectable phenotype by which the 
signaling cell may be differentiated from the same cell in a 
nonsignaling state (either way being a living cell), the 
assay is a screen. However, the term "screening assay" may 
be used in a broader sense to include a selection. When the 
narrower sense is intended, we will use the term 
"nonselective screen".

Various screening and selection systems are discussed 
in Ladner, USP 5,198,346.

Screening and selection may be for or against the 
peptide:target protein or compound:target protein 
interaction.

Preferred assay cells are microbial (bacterial, yeast, 
algal, protozoal), invertebrate, or vertebrate (esp. 
mammalian, particularly human). The best developed 
two-hybrid assays are yeast and mammalian systems.

Normally, two-hybrid assays are used to determine 
whether a protein X and a protein Y interact, by virtue of 
their ability to reconstitute the interaction of the DBD and 
the TAD. However, augmented two-hybrid assays have been 
used to detect interactions that depend on a third, 
non-protein ligand.

For more guidance on two-hybrid assays, see Brent and 
Finley, Jr., Ann. Rev. Genet., 31:663-704 (1997); 
Fremont-Racine, et al., Nature Genetics, 277-281 (16 July 1997); 
Allen, et al., TIBS, 511-16 (Dec. 1995); LeCrenier, et al., 
BioEssays, 20:1-6 (1998); Xu, et al., Proc. Nat. Acad. Sci. 
(USA), 94:12473-8 (Nov. 1992); Esotak, et al., Mol. Cell. 
Biol., 15:5820-9 (1995); Yang, et al., Nucleic Acids Res., 
23:1152-6 (1995); Bendixen, et al., Nucleic Acids Res., 
22:1778-9 (1994); Fuller, et al., BioTechniques, 25:85-92 
(July 1998); Cohen, et al., PNAS (USA) 95:14272-7 (1998); 
Kolonin and Finley, Jr., PNAS (USA) 95:14266-71 (1998). See 
also Vasavada, et al., PNAS (USA), 88:10686-90 (1991) 
(contingent replication assay), and Rehrauer, et al., J. 
Biol. Chem., 271:23865-73 (1996) (LexA repressor cleavage 
assay).

Two-Hybrid Systems: Reporter Enzyme Type

In another embodiment, the components A and B 
reconstitute an enzyme which is not a transcription factor. 
As in the last example, the effect of the 
reconstitution of the enzyme is a phenotypic change which 
may be a screenable change, a selectable change, or both.



In vivo Diagnostic Uses

Radio-labeled ABM may be administered to the human or 
animal subject. Administration is typically by injection, 
e.g., intravenous or arterial, or by other means of 
administration, in a quantity sufficient to permit subsequent 
dynamic and/or static imaging using suitable radio-detecting 
devices. The dosage is the smallest amount capable of 
providing a diagnostically effective image, and may be 
determined by means conventional in the art, using known 
radio-imaging agents as a guide.

Typically, the imaging is carried out on the whole body 
of the subject, or on that portion of the body or organ 
relevant to the condition or disease under study. The 
amount of radio-labeled ABM accumulated at a given point in 
time in relevant target organs can then be quantified.

A particularly suitable radio-detecting device is a
scintillation camera, such as a gamma camera. A
scintillation camera is a stationary device that can be used
to image the distribution of radio-labeled ABM. The detection
device in the camera senses the radioactive decay, the
distribution of which can be recorded. Data produced by the
imaging system can be digitized. The digitized information
can be analyzed over time discontinuously or continuously.
The digitized data can be processed to produce images,
called frames, of the pattern of uptake of the radio-labeled
ABM in the target organ at a discrete point in time. In
most continuous (dynamic) studies, quantitative data are
obtained by observing changes in distributions of
radioactive decay in target organs over time. In other
words, a time-activity analysis of the data will illustrate
uptake through clearance of the radio-labeled binding
protein by the target organs with time.
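
The time-activity analysis described above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the frame times and the count values are hypothetical and do not come from the specification, and the trapezoidal rule is one possible way to summarize total exposure.

```python
from typing import List, Tuple


def time_activity_summary(
    times_min: List[float], counts: List[float]
) -> Tuple[float, float, float]:
    """Summarize a time-activity curve for one target organ.

    Returns (time of peak uptake, counts at the peak, area under the curve).
    The area is computed with the trapezoidal rule (an assumed choice).
    """
    if len(times_min) != len(counts) or len(times_min) < 2:
        raise ValueError("need at least two matched time/count samples")
    # Frame with the highest count marks peak uptake.
    peak_index = max(range(len(counts)), key=lambda i: counts[i])
    # Trapezoidal integration over consecutive frames.
    auc = sum(
        (times_min[i + 1] - times_min[i]) * (counts[i] + counts[i + 1]) / 2.0
        for i in range(len(counts) - 1)
    )
    return times_min[peak_index], counts[peak_index], auc


# Hypothetical frames: acquisition time in minutes and decay counts per frame.
frame_times = [0.0, 5.0, 10.0, 20.0, 40.0]
frame_counts = [0.0, 800.0, 1200.0, 900.0, 300.0]
print(time_activity_summary(frame_times, frame_counts))
```

Frames before the peak describe uptake and frames after it describe clearance, which is the pattern the paragraph refers to.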

Various factors should be taken into consideration in
selecting an appropriate radioisotope. The radioisotope
must be selected with a view to obtaining good quality
resolution upon imaging, should be safe for diagnostic use
in humans and animals, and should preferably have a short
physical half-life so as to decrease the amount of radiation
received by the body. The radioisotope used should
preferably be pharmacologically inert, and, in the
quantities administered, should not have any substantial
physiological effect.

The ABM may be radio-labeled with different isotopes of
iodine, for example 123I, 125I, or 131I (see, for example, U.S.
Patent 4,609,725). The extent of radio-labeling must,
however, be monitored, since it will affect the calculations
made based on the imaging results (i.e., a diiodinated ABM
will result in twice the radiation count of a similar
monoiodinated ABM over the same time frame).
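
The effect of labeling extent on the calculations can be illustrated with a minimal sketch. It assumes that counts scale linearly with the average number of iodine atoms per ABM molecule, as the diiodinated example implies; the function name and the numbers are hypothetical.

```python
def counts_per_abm(measured_counts: float, iodine_atoms_per_abm: float) -> float:
    """Normalize a radiation count by the average labeling extent.

    Assumes counts are proportional to the number of radio-iodine atoms
    carried by each ABM molecule (2 for a diiodinated ABM, 1 for a
    monoiodinated one).
    """
    if iodine_atoms_per_abm <= 0:
        raise ValueError("labeling extent must be positive")
    return measured_counts / iodine_atoms_per_abm


# A diiodinated preparation reading 2000 counts corresponds to the same
# amount of ABM as a monoiodinated preparation reading 1000 counts.
print(counts_per_abm(2000.0, 2.0), counts_per_abm(1000.0, 1.0))
```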

In applications to human subjects, it may be desirable
to use radioisotopes other than 125I for labeling in order to
decrease the total dosimetry exposure of the human body and
to optimize the detectability of the labeled molecule
(though this radioisotope can be used if circumstances
require). Ready availability for clinical use is also a
factor. Accordingly, for human applications, preferred
radio-labels are, for example, 99mTc, 67Ga, 68Ga, 90Y, 111In,
113mIn, 123I, 186Re, 188Re or 211At.

The radio-labeled ABM may be prepared by various
methods. These include radio-halogenation by the chloramine-T
method or the lactoperoxidase method and subsequent
purification by HPLC (high pressure liquid chromatography),
for example as described by J. Gutkowska et al. in
"Endocrinology and Metabolism Clinics of America" (1987)
16(1):183. Other known methods of radio-labeling can be used,
such as IODOBEADS™.

There are a number of different methods of delivering
the radio-labeled ABM to the end-user. It may be
administered by any means that enables the active agent to
reach the agent's site of action in the body of a mammal.
Because proteins are subject to being digested when
administered orally, parenteral administration, i.e.,
intravenous, subcutaneous, or intramuscular, would ordinarily
be used to optimize absorption of an ABM, such as an
antibody, which is a protein.



EXAMPLES

We are utilizing a mouse model of diet-induced obesity
that progresses to diabetes. The diet is high in fat and
has been documented to lead to diabetes in C57BL/6J mice
(Surwit et al., 1988). After weaning, C57BL/6J mice were
fed either the high fat diet or a standard lab chow diet for
16 weeks. Body weight was monitored bi-weekly. Fasting
glucose and insulin levels were measured after 2, 4, 8, and
16 weeks on the diets. At each time point, several diabetic
and control mice were sacrificed and a number of tissues
collected. For further analysis, RNA was extracted from the
gastrocnemius muscles at each time point and used in DNA
microarray analyses.

Animal Models.

Obesity and subsequent hyperinsulinemia and
hyperglycemia were induced by feeding a group of 3 week old
mice (50 C57BL/6 males) a high-fat diet (Bio-Serve,
Frenchtown, NJ, #F1850 High Carbohydrate-High Fat; 56% of
calories from fat, 16% from protein and 27% from
carbohydrates). Another group of 3 week old mice (20
C57BL/6 males) was fed the normal control diet (PMI
Nutrition International Inc., Brentwood, MO, Prolab RMH3000;
14% of calories from fat, 16% from protein and 60% from
carbohydrates). The mice were placed onto the respective
diets immediately following weaning. Animal weights were
determined weekly. Fasting blood-glucose and plasma insulin
measurements were determined after 2, 4, 8 and 16 weeks on
the respective diets.

The day after obtaining body weight measurements at the
indicated time points, mice were fasted 8 hours and blood
glucose concentrations were measured via tail blood samples
using a One Touch Glucometer (Lifescan). For insulin
measurements, blood was collected into heparinized tubes,
plasma was obtained by centrifugation, and insulin
concentrations were determined using an Ultra-Sensitive Rat
Insulin ELISA kit (ALPCO) as instructed by the manufacturer.
Values were adjusted by a factor of 1.23 as determined by the
manufacturer to correct for species difference in
cross-reactivity with the antibody (bottom panel). Results
reflect mean ± SE of 50 mice on the HF diet and 20 mice on
the Std diet.
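
The insulin adjustment and the mean ± SE summary can be sketched as below. The text does not say whether the 1.23 factor multiplies or divides the raw reading, so multiplication is an assumption here; the function names and the raw readings are hypothetical.

```python
import math
from statistics import mean, stdev

# Species cross-reactivity factor quoted in the text; applying it by
# multiplication is an assumption of this sketch.
SPECIES_CORRECTION = 1.23


def corrected_insulin(raw_values):
    """Apply the species correction factor to raw ELISA readings."""
    return [value * SPECIES_CORRECTION for value in raw_values]


def mean_and_se(values):
    """Return (mean, standard error of the mean) for a group of animals."""
    n = len(values)
    if n < 2:
        raise ValueError("standard error needs at least two values")
    # SE = sample standard deviation / sqrt(n)
    return mean(values), stdev(values) / math.sqrt(n)


# Hypothetical raw readings for three animals (arbitrary units).
raw_readings = [1.0, 2.0, 3.0]
group_mean, group_se = mean_and_se(corrected_insulin(raw_readings))
print(group_mean, group_se)
```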

Normal weight, normal fasting blood glucose and normal
fasting plasma insulin levels are defined as the respective
mean values of the animals fed the control diet.

Two of the "most typical" animals were selected for
each group (Control, Hyperinsulinemic and Diabetic) at each
time point (2, 4, 8, and 16 weeks after commencement of
diet) for sacrifice. The selected mice were sacrificed and
muscle tissue obtained and immediately processed for RNA
isolation.

Fasting Blood Glucose Levels.

Blood glucose levels were measured from a drop of blood
taken from the tip of the tail of fasted (8 hr) mice using a
Lifescan Genuine One Touch glucometer. All measurements
occurred between 2:00 pm and 5:00 pm.

Plasma insulin measurements.

Blood was collected from the tail of fasted (8 hr) mice
into a heparinized capillary tube and stored on ice. All
collections occurred between 2:00 pm and 5:00 pm. Plasma
was separated from red blood cells by centrifugation for 10
minutes at 8000 x g and then stored at -20°C. Insulin
concentrations were determined using the Rat Insulin ELISA
kit and rat insulin standards (ALPCO) essentially as
instructed by the manufacturer. Values were adjusted by a
factor of 1.23 as determined by the manufacturer to correct
for the species difference in cross-reactivity with the
antibody.

RNA isolation.

Total RNA was isolated from muscle (skeletal muscle,
specifically, gastrocnemius) of two mice at each time point
during the progression of HF diet-induced type 2 diabetes,
as well as age-matched controls on the Std diet, using the
RNA STAT-60 Total RNA/mRNA Isolation Reagent according to
the manufacturer's instructions (Tel-Test, Friendswood, TX).

Sample Quantification and Quality Assessment

Total RNA was quantified and assessed for quality on a
Bioanalyzer RNA 6000 Nano chip (Agilent). Each chip
contained an interconnected set of gel-filled channels that
allowed for molecular sieving of nucleic acids. Pin-electrodes
in the chip were used to create electrokinetic
forces capable of driving molecules through these micro-channels
to perform electrophoretic separations. Ribosomal
peaks were measured by fluorescence signal and displayed in
an electropherogram. A successful total RNA sample featured
2 distinct ribosomal peaks (18S and 28S rRNA).

Biotinylated cRNA Hybridization Target.

Total RNA was prepared for use as a hybridization
target as described in the manufacturer's instructions for
CodeLink Expression Bioarrays (TM) (Amersham Biosciences).
The CodeLink Expression Bioarrays utilize nucleic acid
hybridization of a biotin-labeled complementary RNA (cRNA)
target with DNA oligonucleotide probes attached to a gel
matrix.

The biotin-labeled cRNA target is prepared by a linear
amplification method. Poly(A)+ RNA (within the total RNA
population) is primed for reverse transcription by a DNA
oligonucleotide containing a T7 RNA polymerase promoter 5'
to a (dT)24 sequence. After second-strand cDNA synthesis,
the cDNA serves as the template in an in vitro transcription
(IVT) reaction to produce the target cRNA. The IVT is
performed in the presence of biotinylated nucleotides to
label the target cRNA. This procedure results in a 50-200
fold linear amplification of the input poly(A)+ RNA.

Hybridization Probes.

The oligonucleotide probes were provided by the
Codelink Uniset Mouse I Bioarray (Amersham, product code
300013). Amine-terminated oligonucleotide probes are
attached to a three-dimensional polyacrylamide gel matrix.
There are 10,000 oligonucleotide probes, each specific to a
well-characterized mouse gene. Each mouse gene is
representative of a unique gene cluster from the fourth
quarter 2001 Genbank Unigene build. There are also 500
control probes.

The sequences of the probes are proprietary to
Amersham. However, for each probe, Amersham identifies the
corresponding mouse gene by NCBI accession number, OGS,
LocusLink, Unigene Cluster ID, and description (name).
This information should be available from Amersham. In the
case of the differentially expressed probes, this
information is duplicated in Master Table 1. For the
complete list, see

http://www4.amershambiosciences.com/aptrix/upp01077.nsf/Content/codelink_literature

Under "Gene Lists", select "Uniset Mouse I", and a gene
list, in Excel format, can be downloaded.

Hybridization

Using the cRNA target, the hybridization reaction
mixture is prepared and loaded into array chambers for
bioarray processing as set forth in the manufacturer's
instructions for CodeLink Gene Expression Bioarrays™
(Amersham Biosciences). Each sample is hybridized to an
individual microarray. Hybridization is at 37°C. The
hybridization buffer is prepared as set forth in the
Motorola instructions. Hybridization to the microarray is
detected with an avidinated fluorescent reagent,
Streptavidin-Alexa Fluor® 647 (Amersham).

Mouse Gene Expression Analysis

Processed arrays were scanned using a GenePix 4000B
Microarray Scanner (Axon Instruments, Inc.); array images
were acquired using the Amersham CodeLink™ Analysis Software
(Release 2.2). The Amersham CodeLink™ Analysis Software
gives an integrated optical density (IOD) value for every
spot; a unique background value for that spot is subtracted,
resulting in "raw" data points. Individual chips are then
normalized by the Amersham CodeLink™ software according to
the median raw intensity for all 10,000 genes. A negative
control threshold (0.2) is also calculated according to the
control probes. The expression data were analyzed to identify
genes whose expression levels changed significantly with
respect to:

Normal mice compared to hyperinsulinemic mice at 2, 4,
8 and 16 weeks on normal vs. high-fat diet.

Normal mice compared to hyperinsulinemic/hyperglycemic
mice at 2, 4, 8 and 16 weeks on normal vs. high-fat
diet.

Hyperinsulinemic compared to
hyperinsulinemic/hyperglycemic mice at 2, 4, 8 and 16
weeks on high-fat diets.
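
The per-chip normalization step can be sketched as follows. This is not the CodeLink software's own algorithm, which is proprietary; it is a minimal illustration that assumes normalization means dividing each background-subtracted value by the chip median and that the 0.2 threshold is applied to the normalized value. The function name and probe identifiers are hypothetical.

```python
from statistics import median


def normalize_chip(raw_intensities, negative_threshold=0.2):
    """Median-normalize one chip and flag probes above the threshold.

    raw_intensities maps a probe identifier to its background-subtracted
    ("raw") intensity. Returns (normalized values, detection flags).
    """
    chip_median = median(raw_intensities.values())
    if chip_median <= 0:
        raise ValueError("chip median must be positive")
    normalized = {
        probe: value / chip_median for probe, value in raw_intensities.items()
    }
    # Assumed rule: a probe counts as detected only above the threshold.
    detected = {
        probe: value > negative_threshold for probe, value in normalized.items()
    }
    return normalized, detected


# Hypothetical raw intensities for three probes on one chip.
chip = {"probe_1": 10.0, "probe_2": 100.0, "probe_3": 200.0}
normalized, detected = normalize_chip(chip)
print(normalized, detected)
```

Normalizing each chip to its own median places chips from different animals and time points on a common scale before the three comparisons listed above are made.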

Database Searches

Nucleotide sequences and predicted amino
acid sequences were compared to public domain databases
using the BLAST 2.0 program (National Center for
Biotechnology Information, National Institutes of Health).
Nucleotide sequences were displayed using ABI Prism Edit
View 1.0.1 (PE Applied Biosystems, Foster City, CA).

Nucleotide database searches were conducted with the
then-current version of BLASTN 2.0.12; see Altschul, et al.,
"Gapped BLAST and PSI-BLAST: a new generation of protein
database search programs", Nucleic Acids Res., 25:3389-3402
(1997). Searches employed the default parameters, unless
otherwise stated.

For blastN searches, the default was the blastN matrix
(1,-3), with gap penalties of 5 for existence and 2 for
extension.

Protein database searches were conducted with the
then-current version of BLASTX; see Altschul et al. (1997),
supra. Searches employed the default parameters, unless
otherwise stated. The scoring matrix was BLOSUM62, with gap
costs of 11 for existence and 1 for extension. The standard
low complexity filter was used.
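
To show how the existence and extension penalties combine, here is a small worked sketch. It assumes the convention that a gap of length k costs existence + k × extension, and it scores an alignment from hypothetical match, mismatch and gap counts; it is not a BLAST implementation and the function names are illustrative only.

```python
def gap_cost(length: int, existence: int, extension: int) -> int:
    """Affine cost of one gap: existence + length * extension (assumed)."""
    if length < 1:
        raise ValueError("a gap has length of at least 1")
    return existence + extension * length


def blastn_raw_score(matches, mismatches, gap_lengths,
                     reward=1, penalty=-3, existence=5, extension=2):
    """Raw score of a nucleotide alignment under the (1,-3) matrix."""
    gap_total = sum(gap_cost(g, existence, extension) for g in gap_lengths)
    return matches * reward + mismatches * penalty - gap_total


# 20 matches, 1 mismatch and one gap of length 2: 20 - 3 - (5 + 2*2) = 8.
print(blastn_raw_score(20, 1, [2]))
# Under the protein defaults (11, 1), a one-residue gap costs 12.
print(gap_cost(1, existence=11, extension=1))
```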

"ref" indicates that NCBI's RefSeq is the source
database. The identifier that follows is a RefSeq accession
number, not a GenBank accession number. "RefSeq sequences
are derived from GenBank and provide non-redundant curated
data representing our current knowledge of known genes. Some
records include additional sequence information that was
never submitted to an archival database but is available in
the literature. A small number of sequences are provided
through collaboration; the underlying primary sequence data
is available in GenBank, but may not be available in any one
GenBank record. RefSeq sequences are not submitted primary
sequences. RefSeq records are owned by NCBI and therefore
can be updated as needed to maintain current annotation or
to incorporate additional sequence information." See also
http://www.ncbi.nlm.nih.gov/LocusLink/refseq.html

It will be appreciated by those in the art that the
exact results of a database search will change from day to
day, as new sequences are added. Also, if one queries with a
longer version of the original sequence, the results will
change. The results given here were obtained at one time
and no guarantee is made that the exact same hits would be
obtained in a search on the filing date. However, if an
alignment between a particular query sequence and a
particular database sequence is discussed, that alignment
should not change (if the parameters and sequences remain
unchanged).

Northern Analysis.

Northern analysis may be used to confirm the results.
Favorable and unfavorable genes, identified as described
above, or fragments thereof, will be used as probes in
Northern hybridization analyses to confirm their
differential expression. Total RNA isolated from subject
mice will be resolved by agarose gel electrophoresis through
a 1% agarose, 1% formaldehyde denaturing gel, transferred
to positively charged nylon membrane, and hybridized to a
probe labeled with [32P]dCTP that was generated from the
aforementioned gene or fragment using the Random Primed DNA
Labeling Kit (Roche, Palo Alto, CA), or to a probe labeled
with digoxigenin (Roche Molecular Biochemicals,
Indianapolis, IN), according to the manufacturer's
instructions.

Real-Time RNA Analysis.

Real-time RNA analysis may also be used for
confirmation. For "real-time" RNA analysis, RNA will be
converted to cDNA and then probed with gene-specific primers
made for each clone. "Real-time" incorporation of
fluorescent dye will be measured to determine the amount of
specific transcript present in each sample. Sample
differences (control vs. hyperinsulinemic, hyperinsulinemic
vs. diabetic, or control vs. diabetic) will be evaluated.
Confirmation using several independent animals is desirable.

In situ Hybridization

Another form of confirmation may be provided by
nonisotopic in situ hybridizations (NISH) on selected human
(obtained by Tissue Informatics) and mouse tissues using
cRNA probes generated from mouse genes found to be up- or
down-regulated during the disease progression. In situ
hybridizations may also be performed on mouse tissues using
cRNA probes generated from differentially expressed DNAs.
These cRNAs will hybridize to their corresponding messenger
RNAs present in cells and will provide information
regarding the particular cell types within a tissue that are
expressing the particular gene as well as the relative level
of gene expression. The cRNA probes may be generated by in
vitro transcription of template cDNA by Sp6 or T7 RNA
polymerase in the presence of digoxigenin-11-UTP (Roche
Molecular Biochemicals, Mannheim, Germany; Pardue, M.L.
1985. In: In situ hybridization, Nucleic acid
hybridization, a practical approach: IRL Press, Oxford,
179-202).

Transgenic Animals.

Transgenic expression may be used to confirm the results.
In one embodiment, a mouse is engineered to overexpress the
favorable or unfavorable mouse gene in question. In another
embodiment, a mouse is engineered to express the
corresponding favorable or unfavorable human gene. In a
third embodiment, a nonhuman animal other than a mouse, such
as a rat, rabbit, goat, sheep or pig, is engineered to
express the favorable or unfavorable mouse or human gene.

Hyperquantitative Tissue Analysis

In addition to gene expression analysis, the tissue
sections can also be analyzed using TissueInformatics,
Inc.'s TissueAnalytics™ software. A single representative
section may be cut from each tissue block, placed on a
slide, and stained with H&E. Digital images of each slide
may be acquired using a research microscope and digital
camera (Olympus E600 microscope and Sony DKC-ST5). These
images may be acquired at 20x magnification with a
resolution of 0.64 mm/pixel. A hyperquantitative analysis
may be performed on the resulting images. First, a digital
image analysis can identify and annotate structural objects
in a tissue using machine vision. These objects, which are
constituents of the tissue, can be annotated because they
are visually identifiable and have a biological meaning.
Subsequently, these structures may be quantified with regard
to their geometric properties, such as area or stain
intensity, and their relationship to the field of view or per
unit area in terms of a % coverage. Features or
parameters for hyper-quantification are specific for each
tissue, and may also include relations between features and
measures of overall heterogeneity, including orientation,
relative locations, and textures.

Correlation Analysis

Mathematical statistics provides a rich set of additional
tools to analyze time-resolved data sets of hyper-quantitative
and gene expression profiles for similarities,
including rank correlation, the calculation of regression
and correlation coefficients, and clustering. Continuous
functions may also be fitted through the data points of
individual gene and tissue feature data. The relation between
gene expression and hyper-quantitative tissue data may be
linear or non-linear, in synchronous or asynchronous
arrangements.
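
As one example of the rank correlation mentioned above, the sketch below computes a Spearman coefficient between a gene's expression profile and a tissue feature measured at the same time points. The specification does not prescribe this particular statistic; the function names and data are hypothetical, and ties receive average ranks.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences of length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical profiles at 2, 4, 8 and 16 weeks.
gene_expression = [1.0, 0.8, 0.6, 0.5]
tissue_feature = [10.0, 20.0, 30.0, 40.0]
print(spearman(gene_expression, tissue_feature))
```

Because it depends only on ranks, this statistic captures monotonic but non-linear relations, which fits the remark that the relation may be linear or non-linear.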



Example 1

Obesity is increasing at an alarming rate in the United
States. In parallel, the incidence of type II diabetes is
also rising. We are interested in defining alterations in
gene expression that correlate with the development of these
conditions in the hopes of reversing these dangerous trends.
Insulin plays a major role in regulating blood glucose
levels. It stimulates the uptake of glucose in adipose
tissue and striated muscle for storage as intracellular
triglycerides and glycogen. Insulin also inhibits the
release of glucose from the liver. Normally, this would
prevent the rise in blood sugar concentration that occurs
after eating. However, in the early stages of type 2
diabetes, resistance to insulin is seen.

Muscle plays a major role in glucose metabolism. Thus, it
also is a major contributor to the development of type 2
diabetes. In normal situations, muscle cells respond to
increasing levels of insulin by increasing glucose uptake
from the bloodstream. However, during the very early stages
of type 2 diabetes, muscle tissue becomes resistant to
insulin, requiring the pancreatic beta cells to increase
insulin secretion. Eventually, the beta cells become unable
to compensate for this increasing insulin resistance from
muscle and other cells, and insulin production drops. Thus,
clinical type 2 diabetes results from the combination of
insulin resistance and impaired beta cell function.

Defects in muscle glycogen synthesis are known to play a
role in the development of insulin resistance (Petersen and
Shulman, 2002). At least three steps - those mediated by
glycogen synthase, hexokinase, and GLUT4 - have been
reported to be defective in patients with type 2 diabetes.
Fatty acids also can induce insulin resistance, and it has
been suggested that this was a consequence of altered
insulin signaling through PI3-kinase.

We are utilizing a mouse model of diet-induced obesity
that progresses to diabetes. The diet is high in fat, an
increasing component in the U.S. diet, and has been
documented to lead to diabetes in C57BL/6J mice (Surwit et
al., 1988). After weaning, C57BL/6J mice were fed either
the high fat diet or a standard lab chow diet for 16 weeks.
Body weight was monitored bi-weekly. Fasting glucose and
insulin levels were measured after 2, 4, 8, and 16 weeks on
the diets.

Consumption of the HF diet resulted in significant,
progressive increases in body weight and fasting insulin
levels in comparison to consumption of the Std diet.
Fasting glucose levels of mice on the HF diet were
dramatically increased at the first time point assayed (2
weeks) and remained high through the duration of the
experiment (16 weeks).

At each time point, several diabetic and control mice were
sacrificed and a number of tissues collected. RNA was
extracted from the gastrocnemius muscle at each time point.
In order to identify additional muscle genes involved in
the development of type 2 diabetes, we used microarray
analysis to compare RNA expression levels of 10,000 genes in
muscle of high fat diet fed and control diet fed mice at
various time points in the progression of type 2 diabetes.
Microarray analysis provides a more global picture of gene
regulation, allowing the identification of families or
groups of genes showing similar expression patterns that
potentially imply similar or coordinated roles in disease
progression.

Of 10,000 genes analyzed, 121 were up-regulated but only 7
down-regulated greater than two-fold in the diabetic
relative to non-diabetic mice. These genes are listed in
Master Table 1.

This distribution of up- and down-regulated genes was much
different from that seen for other organs (liver, pancreas,
and white adipose tissue) where there was a much closer
balance between the number of up- and down-regulated genes.
Actin, alpha, cardiac (Actc1, NM_009608) was one of the most
down-regulated genes when comparing HF to Std mice. It was
consistently expressed at lower levels in the HF diabetic
mice in comparison to the Std mice and also steadily
decreased over the 16 week study.

Example 2 

Interestingly, further analysis of the time points and
exploration of gene pathways and functionally related genes
revealed a subset of actin-related and actin-binding genes
exhibiting a consistent decrease in expression (although
less than two-fold) in the diabetic mice; 9 of 37
functionally related genes were decreased in diabetic muscle
at all four time points and an additional 9 were decreased
at three of the four time points. Only two of these genes
had been included in the original list of 7 down-regulated
genes using the two-fold cut-off criterion.
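
The tally of genes decreased at all four, or at three of four, time points can be sketched as follows. The gene names and ratios are hypothetical placeholders, not data from the study, and a gene is counted as decreased at a time point when its HF/Std expression ratio is below 1 (an assumed criterion, since the decreases discussed here are below the two-fold cut-off).

```python
from collections import Counter


def count_by_timepoints_decreased(ratios_by_gene):
    """Tally genes by the number of time points with an HF/Std ratio < 1.

    ratios_by_gene maps a gene name to its HF/Std expression ratios at the
    2, 4, 8 and 16 week time points.
    """
    tally = Counter()
    for ratios in ratios_by_gene.values():
        decreased = sum(1 for ratio in ratios if ratio < 1.0)
        tally[decreased] += 1
    return tally


# Hypothetical ratios for three genes of a functionally related family.
family = {
    "gene_a": [0.8, 0.9, 0.7, 0.6],  # decreased at all four time points
    "gene_b": [0.8, 1.1, 0.7, 0.6],  # decreased at three of four
    "gene_c": [1.2, 1.1, 0.9, 1.0],  # decreased at one
}
print(count_by_timepoints_decreased(family))
```

With the counts reported in the text, 9 + 9 = 18 of 37 genes, or about 49%, were decreased at three or more time points.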

It is possible that this subtle but coordinated
down-regulation of actin-related or actin-binding genes reflects
a role in the decreased glucose uptake by skeletal muscle
that occurs in diabetes. With nearly half (18 of 37) of the
genes in a related family of genes being consistently
down-regulated in a study that did not identify a large number of
down-regulated genes, we feel that actin and genes in
actin-related pathways may prove to play key roles in muscle as
obesity and diabetes progress.

The actin-related and actin-binding mouse genes in
question have been included at the end of Master Table 1,
subtable 1A.

Introduction to Master Tables

The master tables reflect applicants' analysis of the gene
chip data.

For each probe corresponding to a differentially expressed
mouse gene, Master Table 1 identifies

Col. 1: The mouse gene (upper) and mouse protein (lower)
database accession #s.

Col. 2: The corresponding mouse Unigene Cluster, as of the
4th Quarter 2001 build.
Col. 3: The behavior (differential expression) observed for
the mouse gene. This column identifies the gene as
favorable (F) or unfavorable (U) on the basis of its
strongest differential behavior at the ages tested. There
are three possible comparisons, HI-D, C-HI, and C-D, where
C=control (normal), HI=hyperinsulinemic, and D=diabetic.
If HI>D, C>HI, or C>D, the behavior for that subject
comparison is considered unfavorable. If the inequality is
reversed, the behavior for that subject comparison is
considered favorable.

In the Master Table, the numerical value is the ratio of
the greater value to the lesser value. If this ratio is at
least two fold, the degree of differential expression is
considered strong. Usually only mouse genes exhibiting at
least one strong differential expression behavior are listed
in the Master Table; exceptions are noted in the Examples.
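
The rules for a single comparison can be restated as a short sketch. It follows the definitions given for col. 3 (former value greater than latter is unfavorable, the reverse is favorable, and a ratio of at least two is strong); the function name and the expression values are hypothetical.

```python
def describe_comparison(former: float, latter: float):
    """Classify one comparison such as C-HI, HI-D or C-D.

    'former' and 'latter' are the expression values of the first and second
    named groups (for C-HI, former is C and latter is HI).
    Returns (behavior, fold ratio, is_strong).
    """
    if former <= 0 or latter <= 0:
        raise ValueError("expression values must be positive")
    # The tabulated value is the ratio of the greater to the lesser value.
    ratio = max(former, latter) / min(former, latter)
    if former > latter:
        behavior = "unfavorable"
    elif former < latter:
        behavior = "favorable"
    else:
        behavior = "none"
    return behavior, ratio, ratio >= 2.0


# Hypothetical C-D comparison: control 100, diabetic 250.
print(describe_comparison(100.0, 250.0))
# Hypothetical HI-D comparison: hyperinsulinemic 300, diabetic 200.
print(describe_comparison(300.0, 200.0))
```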
In Master Table 1, subtables 1A and 2A, the fold
expression values are negative. Likewise, in subtables 1C
and 2C, the fold expression values for the favorable
behaviors are negative. This does not have its usual
mathematical meaning; it is merely a flag that in at least
one comparison (HI-D, C-HI, and C-D), the former value was
less than the latter one, i.e., the behavior was favorable.
For the purpose of applying the teachings of the
specification concerning desired ratios, any negative value,




Col. 4: A related human protein, identified by its database
accession number. Usually, several such proteins are
identified relative to each mouse gene. These proteins have
been identified by BLAST searches, as explained in cols. 6-8.

Col. 5: The name of the related human protein.

Col. 6: The score (in bits) for the alignment performed by
the BLAST program.

Col. 7: The E-value for the alignment performed by the BLAST
program. It is worth noting that Unigene considers a Blastx
E-value of less than 1e-6 to be a "match" to the reference
sequence of a cluster.

Unless otherwise indicated, the bit score and E-value for
the alignment are with respect to the alignment of the mouse
DNA of col. 1 to the human protein of col. 4 by BlastX,
according to the default parameters.

Master Table 1 is divided into three subtables on the basis
of the behavior in col. 3. If a gene has at least one
significantly favorable behavior, and no significantly
unfavorable ones, it is put into Subtable 1A. In the
opposite case, it is put into Subtable 1B. If its behavior
is mixed, i.e., at least one significantly favorable and at
least one significantly unfavorable, it is put into Subtable
1C. Note that this classification is based on the strongest
observed differential expression behaviors for each of the
three subject comparisons, C-HI, HI-D and C-D.
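
The subtable assignment can be sketched as below. It assumes that "significantly" favorable or unfavorable means a strong (at least two-fold) behavior, which is an interpretation rather than a definition given in the text; the function name and input format are hypothetical.

```python
def assign_subtable(behaviors):
    """Assign a gene to Subtable 1A, 1B or 1C.

    behaviors is a list of (behavior, is_strong) pairs, one per subject
    comparison (C-HI, HI-D, C-D), where behavior is "favorable" or
    "unfavorable". Returns None if no strong behavior was observed.
    """
    strong_favorable = any(
        behavior == "favorable" and strong for behavior, strong in behaviors
    )
    strong_unfavorable = any(
        behavior == "unfavorable" and strong for behavior, strong in behaviors
    )
    if strong_favorable and strong_unfavorable:
        return "1C"  # mixed behavior
    if strong_favorable:
        return "1A"
    if strong_unfavorable:
        return "1B"
    return None


# Hypothetical gene: strongly favorable in C-D, weakly unfavorable in HI-D.
print(assign_subtable([("favorable", False), ("unfavorable", False), ("favorable", True)]))
```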



The corresponding human gene clusters are also of interest.
These may be obtained in a number of ways. First, one may
search on Unigene
(http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=unigene)
for the identified human protein. Review the "hits" (each
of which is a Unigene record) for those prefixed by "Hs."
Secondly, one may access the Unigene record for the mouse
gene cluster (which is given in Master Table 1), and then
click on "Homologene". This will bring up a new page which
includes the section "Possible Homologous Genes". One of
the entries should be a Homo sapiens gene (considered by
Unigene to be the most related human gene); click on its
Unigene record link.

Additional information of interest may be accessed by
searching with the mouse gene accession # in the Mouse Genome
Informatics database, at http://www.informatics.jax.org/.

phosphorylase kinase alpha subunit
AAH14036 Similar to phosphorylase kinase, alpha 2 (liver)
dJ499B10.2 (phosphorylase kinase, alpha 2 (liver) (PYK))
phosphorylase kinase alpha subunit liver isoform, PHKA2 {EC 2.7.1.38} [human, hepatoma, Peptide Partial, 377 aa]
phosphorylase kinase (EC 2.7.1.38) beta chain
Similar to phosphorylase kinase, beta

hypothetical protein
hypothetical protein
hypothetical protein
hypothetical protein FLJ21827
retinol-binding protein 4, plasma precursor
plasma retinol-binding protein precursor
precursor RBP
Plasma retinol-binding protein precursor (PRBP) (RBP) (PRO2222)
Similar to retinol binding protein 4, plasma
similar to Plasma retinol-binding protein precursor (PRBP) (RBP) (PRO2222)
Retinol Binding Protein
Retinol Binding Protein (Holo Form)
Retinol Binding Protein (Apo Form)
retinol binding protein 1
E Chain E, The Structure Of Human Retinol Binding Protein With Its Carrier Protein Transthyretin Reveals Interaction With The Carboxy Terminus Of Rbp
F Chain F, The Structure Of Human Retinol Binding Protein With Its Carrier Protein Transthyretin Reveals Interaction With The Carboxy Terminus Of Rbp
RBP


AF039686_1 G-protein coupled receptor GPR34
G protein-coupled receptor 34
GP34_HUMAN Probable G protein-coupled receptor GPR34
orphan G protein-coupled receptor
unnamed protein product
AAH20678 G protein-coupled receptor 34
CDC37 homolog; CDC37 (cell division cycle 37, S. cerevisiae, homolog); CDC37 (S. cerevisiae) homolog
CC37_HUMAN Hsp90 co-chaperone Cdc37 (Hsp90 chaperone protein kinase-targeting subunit) (p50Cdc37)
CDC37 homolog
AAH00083 CDC37 (cell division cycle 37, S. cerevisiae, homolog)
AAH08793 CDC37 (cell division cycle 37, S. cerevisiae, homolog)
AF123344_1 Kruppel-like zinc finger transcription factor
Kruppel-like factor
KLF2_HUMAN Kruppel-like factor 2 (Lung kruppel-like factor)
AF205849_1 Kruppel-like factor
KLF4_HUMAN Kruppel-like factor 4 (Epithelial zinc-finger protein EZF) (Gut-enriched Krueppel-like factor)
AF105036_1 zinc finger transcription factor GKLF
Kruppel-like factor 4 (gut)
Kruppel-like factor 4 (gut); endothelial Kruppel-like zinc finger protein
hEZF
Similar to Kruppel-like factor 4 (gut)
Similar to Kruppel-like factor 2 (lung)


AF401998_1 muscleblind 41 kD isoform


muscleblind (Drosophila)-like 


KIAA0428


MBNL_HUMAN Muscleblind-like protein (Triplet-expansion RNA-binding protein)


MBNL protein 


muscleblind-like protein MBLL39 isoform 1 


AF491866_1 muscleblind-like protein MLP1


muscleblind-like protein MBLL39 


CHCR isoform G 


MBXL_HUMAN Muscleblind-like X-linked protein (Cys3His CCG1-required protein)
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CHCR isoform G 
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CHCR protein j 


AF491305_1 MBLX39


muscleblind-like protein MBLL39 isoform 2
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CHCR protein ~1 
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AF395876 1 36 kDa muscleblind protein EXP36 1 
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| Unknown (protein for MGC: 1 5409) 


AF289608_1 unknown


Unknown (protein for MGC:32080) 


CCAAT/enhancer binding protein (C/EBP), beta


CCAAT/enhancer binding protein (C/EBP), beta; CCAAT/enhancer-binding protein 
(C/EBP), beta (transcription factor-5) 


CEBB_HUMAN CCAAT/enhancer binding protein beta (C/EBP beta) (Nuclear factor
NF-IL6) (Transcription factor 5)


transcription factor NF-IL6 


nuclear factor NF-IL6 (AA 1-345) 


five-lipoxygenase activating protein (FLAP) 


arachidonate 5-lipoxygenase-activating protein; five-lipoxygenase activating protein; 
MK-886-binding protein 


FLAP_HUMAN 5-lipoxygenase activating protein (FLAP) (MK-886-binding protein)


5-lipoxygenase-activating protein 


5-lipoxygenase activating protein 


lipoxygenase activating protein 


serine (or cysteine) proteinase inhibitor, clade G (C1 inhibitor), member 1


IC1_HUMAN Plasma protease C1 inhibitor precursor (C1 Inh) (C1Inh)


complement C1 inhibitor precursor [validated]


C1 inhibitor


C1 inhibitor


AF435921_1 C1 esterase inhibitor
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plasma jjrotease (C 1 ) inhibitor precursor jj 
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plasma protease (CI) inhibitor precursor 


C1 inhibitor (AA 155-478) (1 is 2nd base in codon)


C1-inhibitor
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similar to polymeric immunoglobulin receptor 


hepatocellular carcinoma associated protein TB6 


polymeric immunoglobulin receptor 


PIGR_HUMAN Polymeric-immunoglobulin receptor precursor (Poly-Ig receptor)
(PIGR) [Contains: Secretory component]


secretory component precursor [validated]


transmembrane secretory component; poly-Ig receptor; SC 


transmembrane secretory component; SC; poly-Ig receptor 


Polymeric immunoglobulin receptor 


poly-Ig receptor 


glycerol-3-phosphate dehydrogenase (EC 1.1.99.5), mitochondrial precursor


glycerol-3-phosphate dehydrogenase


glycerol-3-phosphate dehydrogenase


glycerol-3-phosphate dehydrogenase 2 (mitochondrial)


GPDM_HUMAN Glycerol-3-phosphate dehydrogenase, mitochondrial precursor
(GPD-M) (GPDH-M)


mitochondrial;glycerol-3-phosphate dehydrogenase 


AF311325_1 glycerol-3-phosphate dehydrogenase 3


glycerol-3-phosphate dehydrogenase


AAH19874 Similar to glycerol-3-phosphate dehydrogenase 2 (mitochondrial)


similar to Glycerol-3-phosphate dehydrogenase, mitochondrial precursor (GPD- 
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I coagulation factor XIII, Al polypeptide 


high-mobility group box 1; high mobility group box 1; high-mobility group (nonhistone 
chromosomal) protein 1 


HMG1_HUMAN High mobility group protein 1 (HMG-1)


nonhistone chromosomal protein HMG-1


HMG-1 protein (AA 1-215)


non-histone chromatin protein HMG1


AAH03378 high-mobility group (nonhistone chromosomal) protein 1


high-mobility group (nonhistone chromosomal) protein 1


HMG-1


nonhistone chromosomal protein HMG-1


dJ579F20.1 (high-mobility group (nonhistone chromosomal) protein 1-like 1)


HMIX_HUMAN High mobility group protein 1-like 10 (HMG-1L10)
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probable pseudogene; similar to P09429 (PID:gl23369) 
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high-mobility group box 2; high-mobility group (nonhistone chromosomal) protein 2 
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high mobility group 2 protein I 
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|RFX3 HUMAN DNA-binding protein RFX3 


DNA binding protein RFX3


DNA binding protein RFX3


RFX1_HUMAN MHC class II regulatory factor RFX1 (RFX) (Enhancer factor C)
(EF-C) 


regulatory factor X 


MHC class II regulatory factor RFX 


regulatory factor X1; trans-acting regulatory factor 1; enhancer factor C; MHC class
II regulatory factor RFX


bA32F11.1.2 (regulatory factor X, 3 (influences HLA class II expression), putative
isoform 2)
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F-box only protein 32 isoform 1; muscle atrophy F-box protein; atrogin-1 


FX32_HUMAN F-box only protein 32 (Muscle atrophy F-box protein) (MAFbx)
(Atrogin-1) 
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F-box domain Fbx25-containing protein | 
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AAH15642 Similar to serine (or cysteine) proteinase inhibitor, clade A (alpha-1 
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A1AT_HUMAN Alpha-l-antitrypsin precursor (Alpha-1 protease inhibitor) 
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alpha-l-antitrypsin 1 


Chain A, A 2.1 Angstrom Structure Of An Uncleaved Alpha-1-Antitrypsin Shows
Variability Of The Reactive Center And Other Loops
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Similar to solute carrier family 12 (sodium/potassium/chloride transporters), member 


sodium potassium chloride cotransporter 2; Solute carrier family 12 
(sodium/potassium/chloride transporters), 
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sodium-(potassium)-chloride cotransporter 2) (Kidney-specific Na-K-Cl symporter) 


bumetanide-sensitive Na-K-2Cl cotransporter
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cation-chloride cotransporter-interacting protein 1 


serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 6; protease 
inhibitor 6 (placental thrombin inhibitor) 
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(CAP)(Protease inhibitor 6) (PI-6) 


cytoplasmic antiproteinase; CAP
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8)(SerpinB8) 
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squamous cell carcinoma antigen 1 


serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 3; squamous 
cell carcinoma antigen 1 


SCC1_HUMAN Squamous cell carcinoma antigen 1 (SCCA-1) (Protein T4-A)


squamous cell carcinoma antigen 


squamous cell carcinoma antigen 1 


AAH05224 serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 3 


squamous cell carcinoma antigen; SCC antigen 


voltage-dependent calcium channel gamma-4 subunit; neuronal voltage-gated calcium 
channel gamma-4 subunit 


CCG4_HUMAN Voltage-dependent calcium channel gamma-4 subunit (Neuronal
voltage-gated calcium channel gamma-4 subunit)


calcium channel gamma 4 subunit


AF162692_1 putative voltage-gated calcium channel gamma-4 subunit


calcium channel, voltage-dependent, gamma subunit 4 


voltage-dependent calcium channel gamma-2 subunit; stargazin; neuronal 
voltage-gated calcium channel gamma-2 subunit 


CCG2_HUMAN Voltage-dependent calcium channel gamma-2 subunit (Neuronal
voltage-gated calcium channel gamma-2 subunit)


AF096322_1 neuronal voltage-gated calcium channel gamma-2 subunit


AF361354_1 voltage-dependent calcium channel gamma-8 subunit


voltage-dependent calcium channel gamma-8 subunit; neuronal voltage-gated calcium 
channel gamma-8 subunit 


CCG8_HUMAN Voltage-dependent calcium channel gamma-8 subunit (Neuronal
voltage-gated calcium channel gamma-8 subunit)


AF288388_1 calcium channel gamma subunit 8


voltage-dependent calcium channel gamma-3 subunit; neuronal voltage-gated calcium 
channel gamma-3 subunit 


CCG3_HUMAN Voltage-dependent calcium channel gamma-3 subunit (Neuronal
voltage-gated calcium channel gamma-3 subunit) 
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Unknown gene product 


AF100346_1 neuronal voltage gated calcium channel gamma-3 subunit


AF134640_1 calcium channel gamma subunit 3


calcium channel, voltage-dependent, gamma subunit 3 


similar to calcium channel gamma subunit 8 
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TCI 7 HUMAN Transcription factor 17 (Zinc finger protein eZNF) 
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zinc finger protein 184 (Kruppel-like) 


Unknown (protein for MGC:29879) 


kruppel-related zinc finger protein 


similar to Zinc finger protein 184 


Z184_HUMAN Zinc finger protein 184


b34I8.1 (zinc finger protein 184 (Kruppel-like)) 


similar to EZFIT-related protein 1 


AF352026_1 EZFIT-related protein 1


hypothetical protein


similar to zinc finger protein 91 (HPF7, HTF10) 


Similar to zinc finger protein 208 


protocadherin 7, isoform a precursor; BH-pcdh; BH-protocadherin (brain-heart); 
brain-heart protocadherin 


PCH7_HUMAN Protocadherin 7 precursor (Brain-heart protocadherin) (BH-Pcdh) 


PCDH7 (BH-Pcdh)a 


protocadherin 7, isoform b precursor; BH-pcdh; brain-heart protocadherin;
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PCDH7 (BH-Pcdh)c 
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similar to 5-hydroxyrryptamine 1 A receptor (5-HT-l A) (Serotonin receptor) 
(5-HT1A) (G-21)


5H1A_HUMAN 5-hydroxytryptamine 1A receptor (5-HT-1A) (Serotonin receptor)
(5-HT1A) (G-21)


serotonin receptor 1A


serotonin 5-HT1a receptor


serotonin receptor


serotonin receptor 1A


AF498978_1 5-hydroxytryptamine receptor 1A
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5H1B HUMAN 5-hydroxytrypternine IB receptor (5-HT-lB) (Serotonin 
receptor) (5-HT-1D-beta) (Serotonin 1D beta receptor) (S12)


serotonin receptor 1B


serotonin 1Db receptor


serotonin receptor 
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dJ50 1 M23 . 1 (5-hydroxytryptarnine (serotonin) receptor IB) 1 


5-hydroxytryptamine (serotonin) receptor 1B


serotonin receptor:ISOTYPE=1D-beta


5-hydroxytryptamine (serotonin) receptor 1A


receptor protein (AA 1-421)


guanine nucleotide-binding regulatory protein-coupled receptor


G protein coupled receptor 


XP_003692.2
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CAA40962.1 


AAA66493.1 


BAA94488.1 


AAM21125.1 


NP_000854.1


P28222 


JN0268 
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AAA36029.1 


BAA01763.1 
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sialyltransferase 8D (alpha-2, 8-polysialytransferase); Polysialyltransferase; 
sialyltransferase 8 (alpha-2, 8-polysialytransferase) D 


SI8D_HUMAN CMP-N-acetylneuraminate-poly-alpha-2,8-sialyl transferase
(Alpha-2,8-sialyltransferase 8D) (ST8Sia IV) (Polysialyltransferase-1)


alpha-2,8-polysialyltransferase


alpha-2,8-polysialyltransferase 


polysialyltransferase 


sialyltransferase 8B (alpha-2, 8-sialytransferase); Sialyltransferase X; sialyltransferase 
8 (alpha-2, 8-sialytransferase) B 


SI8B_HUMAN Alpha-2,8-sialyltransferase 8B (ST8Sia II) (Sialyltransferase X)(STX) 


sialyltransferase 


sialyltransferase 


sialyltransferase X 


sialyltransferase STX 


STX protein 


sialyltransferase 


Similar to sialyltransferase 8D (alpha-2, 8-polysialytransferase) 


alpha-2,8-sialyltransferase III 


sialyltransferase 8C (alpha2,3Galbeta1,4GlcNAcalpha 2,8-sialyltransferase);
alpha-2,8-sialyltransferase III


SI8C_HUMAN Sia-alpha-2,3-Gal-beta-1,4-GlcNAc-R:alpha 2,8-sialyltransferase
(Alpha-2,8-sialyltransferase 8C) (ST8Sia III)


Sia alpha2,3Galbeta1,4GlcNAcalpha 2,8-sialyltransferase


wingless-type MMTV integration site family, member 2B, isoform WNT-2B2; 
wingless-type MMTV integration site family, member 13; XWNT2, Xenopus, 
homolog of 


WN2B_HUMAN WNT-2B protein precursor (WNT-13)


NP_005659.1 
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NP 056963.1 
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WNT-2B Isoform 2 


wingless-type MMTV integration site family, member 2B, isoform WNT-2B1; 
wingless-type MMTV integration site family, member 13; XWNT2, Xenopus, 
homolog of 


WNT-2B Isoform 1 


secreted glycoprotein Wnt-13


Wnt-13 


wingless-type MMTV integration site family member 2 precursor; int-1 related
protein; oncogene INT1-like 1; secreted growth factor


WNT2_HUMAN WNT-2 protein precursor (IRP protein) (Int-1 related protein)


int-1-like protein 1 precursor


Irp protein (AA 1-360) 


wingless-type MMTV integration site family member 2 


secreted growth factor 


wingless-type MMTV integration site family, member 5A precursor; proto-oncogene 
Wnt-5A precursor; WNT-5A protein precursor 


WN5A_HUMAN WNT-5A protein precursor


proto-oncogene Wnt-5A precursor 


hWNT5A 


wingless-type MMTV integration site family, member 5B precursor; WNT-5B protein 
precursor 


wingless-type MMTV integration site family, member 5B precursor; 




WN5B_HUMAN WNT-5B protein precursor


AAH01749 Similar to wingless-related MMTV integration site 5B


WNT5B 


wingless-type MMTV integration site family, member 7B precursor 


BAB11985.1


NP_004176.2


BAB11984.1


T09612 
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P41221 
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| WN7B HUMAN WNT-7B protein precursor 


WNT7B 


wingless-type MMTV integration site family, member 7B


AF416743_1 WNT7B


wingless-type MMTV integration site family, member 7A precursor; proto-oncogene 
Wnt7a protein 


Unknown (protein for MGC:10346)


WNT5b precursor 


CCR4-NOT transcription complex, subunit 2; NOT2 (negative regulator of 
transcription 2, yeast) homolog 
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CCR4-NOT transcription complex, subunit 2 


Similar to CCR4-NOT transcription complex, subunit 2 


unnamed protein product 


AF161480_1 HSPC131


AF113226_1 MSTP046


hypothetical protein DKFZp434M0572.1 


hypothetical protein 


a disintegrin and metalloprotease domain 11, isoform 1 preproprotein; 
metalloproteinase-like, disintegrin-like, cysteine-rich protein 


MDC/ADAM11


AD11_HUMAN ADAM 11 precursor (A disintegrin and metalloproteinase domain
11) (Metalloproteinase-like, disintegrin-like, and cysteine-rich protein) (MDC)


disintegrin-like metalloproteinase (EC 3.4.24.-), splice form 2 


P56706 


BAB68399.1 


AAH34923.1 


AAN32640.1 


NP_004616.2


AAH08811.1


AAG38659.1


NP_055330.1
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metalloprotease/disintegrin-like protein 


a disintegrin and metalloprotease domain 11, isoform 2 preproprotein; 
metalloproteinase-like, disintegrin-like, cysteine-rich protein 


disintegrin-like metalloproteinase (EC 3.4.24.-), splice form 1 


MDC=metalloprotease/disintegrin-like cysteine-rich protein [human, cerebellum, 
Peptide, 524 aa] 


MDC protein 


metalloprotease/disintegrin-like protein 


a disintegrin and metalloproteinase domain 22 isoform 5 proprotein; MDC2 delta 


MDC2 beta 


AF073291 1 MDC2 


a disintegrin and metalloproteinase domain 22 isoform 3 proprotein; MDC2 delta 


a disintegrin and metalloproteinase domain 22 isoform 2 proprotein; MDC2 delta 


calcyon 


D1IP_HUMAN D1 dopamine receptor-interacting protein calcyon


AF225903_1 D1 dopamine receptor interacting protein calcyon


Similar to calcyon; D1 dopamine receptor-interacting protein


NTC1_HUMAN Neurogenic locus notch homolog protein 1 precursor (Notch 1)
(hN1) (Translocation-associated notch protein TAN-1)


AF308602_1 NOTCH1


notch protein homolog TAN-1 precursor


TAN1


notch 2 preproprotein 


AF315356_1 NOTCH2 protein
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MYB proto-oncogene protein 


transforming protein myb, splice form containing exon 9A 


alternatively spliced product using exon 9B 


alternatively spliced product using exon 8A


GLK5_HUMAN Glutamate receptor, ionotropic kainate 5 precursor (Glutamate
receptor KA-2) (KA2) (Excitatory amino acid receptor 2) (EAA2) 


glutamate receptor subunit 


glutamate receptor subunit; EAA2; excitatory amino acid receptor 2 


glutamate receptor, ionotropic, kainate 5 


kainate receptor subunit KA2a 


glutamate receptor, ionotropic, kainate 4; excitatory amino acid receptor 1 


GLK4_HUMAN Glutamate receptor, ionotropic kainate 4 precursor (Glutamate
receptor KA-1) (KA1) (Excitatory amino acid receptor 1) (EAA1)


glutamate ionotropic receptor EAA1 chain precursor


excitatory amino acid receptor 1; kainate receptor subunit EAA1


glutamate receptor 6 kainate-preferring precursor 


GluR6 kainate receptor=ionotropic-type glutamate receptor 


glutamate receptor, ionotropic, kainate 2 


GLK2_HUMAN Glutamate receptor, ionotropic kainate 2 precursor (Glutamate
receptor 6) (GluR-6) (GluR6) (Excitatory amino acid receptor 4) (EAA4)


EAA4


GluR6 kainate receptor


kainate receptor subunit


GLK3_HUMAN Glutamate receptor, ionotropic kainate 3 precursor (Glutamate
receptor 7) (GluR-7) (GluR7) (Excitatory amino acid receptor 5) (EAA5) 


glutamate receptor, ionotropic, kainate 3 


AAC96326.1 


TVHUMB 


AAB49035.1 


AAB49036.1 


Q16478 


157936 


AAB22591.1 


NP_002079.2 


CAC80547.1 


NP_055434.1


Q16099


JH0826 


AAB29311.1 


A54260 


AAB31362.1


NP_068775.1


Q13002 


AAC50420.1 
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Q 13003 


NP_000822.1
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collapsin response mediator protein 1; collapsin response mediator protein 1 
(dihydropyrimidinase-like 1) 


DPY1_HUMAN Dihydropyrimidinase related protein-1 (DRP-1) (Collapsin response
mediator protein 1) (CRMP-1)


dihydropyrimidinase related protein 1


dihydropyrimidinase related protein-1


collapsin response mediator protein 1 


collapsin response mediator protein 1 


collapsin response mediator protein 1 


hCRMP-1 


dihydropyrimidinase-like 2; collapsin response mediator protein hCRMP-2 


DPY2_HUMAN Dihydropyrimidinase related protein-2 (DRP-2) (Collapsin response
mediator protein 2) (CRMP-2) (N2A3) 


dihydropyrimidinase-related protein 2 


hCRMP-2


dihydropyrimidinase related protein-2 


N2A3 


dihydropyrimidinase related protein 2


dihydropyrimidinase-like 3 


DPY3_HUMAN Dihydropyrimidinase related protein-3 (DRP-3) (Unc-33-like
phosphoprotein) (ULIP protein) (Collapsin response mediator protein 4) (CRMP-4)


dihydropyrimidinase related protein 3


dihydropyrimidinase related protein-3


dihydropyrimidinase-like 3


ULIP 


AAB60407.1 


AAA95961.1 


NP_001304.1
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JC5316 


BAA11190.1
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NP 001377.1 


Q16555 


JC5317 


AAA93202.1 


BAA11191.1 


AAC05793.1 
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protein Z, vitamin K-dependent plasma glycoprotein 


PRTZ_HUMAN Vitamin K-dependent protein Z precursor


protein Z 


protein Z 


AF440358_1 protein Z, vitamin K-dependent plasma glycoprotein


plasma protein Z precursor 


protein Z 


protein Z spliced variant 


protein Z 


coagulation factor X precursor 


coagulation factor X 


factor X prepeptide 


coagulation factor X precursor; Prothrombinase 


FA10_HUMAN Coagulation factor X precursor (Stuart factor)


coagulation factor Xa (EC 3.4.21.6) precursor


coagulation factor X 


coagulation factor X 


AF503510_1 Coagulation factor X


F9 (coagulation factor IX (plasma thromboplastic component, Christmas
disease, haemophilia B))


coagulation factor IX; Coagulation factor IX (plasma thromboplastic component);
Factor 9; Factor IX; Christmas factor


coagulation factor IX precursor


factor IX (Christmas factor) precursor


coagulation factor IX (plasma thromboplastic component, Christmas disease,
hemophilia B)


FA9_HUMAN Coagulation factor IX precursor (Christmas factor)


coagulation factor IXa (EC 3.4.21.22) precursor


NP_003882.1 


P22891


AAA36500.1


BAA85763.1


AAL27631.1


KXHUZ 
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AAA36499.1 


AAA51984.1 


1205236A 


AAA52490.1 


NP_000495.1


P00742 


EXHU 
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AAA52764.1 


AAM19347.1 


CAA21954.1 


NP_000124.1


AAA52023.1
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PDK4HUMAN [Pyravate dehydrogenase [lipoamide]] kinase isozyme 4, 
mitochondrial precursor (Pyruvate dehydrogenase kinase isoform 4) 


pyruvate dehydrogenase kinase isoform 4 


pyruvate dehydrogenase kinase isoform 4 


pyruvate dehydrogenase kinase isoform 4 


pyruvate dehydrogenase kinase, isoenzyme 4 


pyruvate dehydrogenase kinase, isoenzyme 1 


PDK1_HUMAN [Pyruvate dehydrogenase [lipoamide]] kinase isozyme 1,
mitochondrial precursor (Pyruvate dehydrogenase kinase isoform 1) 
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mu opioid receptor 


opioid receptor, mu 1 


Mu opiate receptor


opioid receptor mu variant MOR1A


mu opioid receptor variant 


DRG kappa 1 splice variant KOR1A


OPRD_HUMAN Delta-type opioid receptor (DOR-1) 


delta opiate receptor 
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sialidase 2; cytosolic sialidase; N-acetyl-alpha-neuraminidase 2; neuraminidase 2 


NER2_HUMAN Sialidase 2 (Cytosolic sialidase) (N-acetyl-alpha-neuraminidase 2)


neuraminidase; sialidase 


sialidase 3; neuraminidase 3; ganglioside sialidase; N-acetyl-alpha-neuraminidase 3 


Neuraminidase


NER3_HUMAN Sialidase 3 (Membrane sialidase) (Ganglioside sialidase)
(N-acetyl-alpha-neuraminidase 3)


ganglioside sialidase 


sialidase 


sialidase 


similar to PYRIN-containing APAF1-like protein 4; PAAD and NACHT-containing
protein 2; ribonuclease inhibitor 2


PYRIN-containing APAF1-like protein 4; PAAD and NACHT-containing protein 2;
ribonuclease inhibitor 2


NAL4_HUMAN NACHT-, LRR- and PYD-containing protein 4 (PAAD and
NACHT-containing protein 2) (PYRIN-containing APAF1-like protein 4)
(Ribonuclease inhibitor 2) 
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AF479747 1 PYRIN-containing APAFl-like protein 4 


unnamed protein product 


AF482706_1 ribonuclease inhibitor 2


similar to PYRIN-containing APAF1-like Protein 7


PYRIN-containing APAF1-like protein 6


PYA6_HUMAN PYRIN-containing APAF1-like protein 6


PYRIN-containing APAF1-like protein 6


PYRIN-containing APAF1-like protein 6
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cold autoinflammatory syndrome 1 ; chromosome 1 open reading frame 7; 
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APAFl-like protein 1 


CIS1_HUMAN Cold autoinflammatory syndrome 1 protein (Cryopyrin) (NACHT-,
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(Angiotensin/vasopressin receptor AII/A VP-like) 


AF410477_1 cryopyrin


cryopyrin 
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similar to PYRIN-containing APAFl-like protein 4; PAAD and NACHT-containing 
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SIM2, homolog of the Drosophila single-minded gene SIMl 


SIM2_HUMAN Single-minded homolog 2


transcription factor SIM2 long form 
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smgle-minded (Drosophila) homolog 2 short isoform; human transcription factor 
SIM2, homolog of the Drosophila single-minded gene SIM1


transcription factor SIM2 short form 
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hypothetical protein FLJ23188 
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unnamed protein product 
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carrier family 12 (sodium/potassium/chloride transporters) 


S122_HUMAN Solute carrier family 12 member 2 (Bumetanide-sensitive


bumetanide-sensitive Na-K-Cl cotransporter


bumetanide-sensitive Na-K-Cl cotransporter
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S123HUMAN Solute carrier family 12 member 3 (Thiazide-sensitive 
sodium-chloride 


solute carrier family 12 (sodium/chloride transporters), member 3; Solute carrier 
family 12 (sodium/potassium/chloride transporters), 


thiazide-sensitive Na-Cl 


NaCl electroneutral Thiazide-sensitive cotransporter 


NaCl electroneutral Thiazide-sensitive cotransporter 


gamma-aminobutyric acid (GABA) A receptor, gamma 3 


GABAA receptor gamma 3 subunit 


GAC3_HUMAN Gamma-aminobutyric-acid receptor gamma-3 subunit precursor 
(GABA(A) receptor) 


GABAA receptor gamma 3 subunit 


GABAA receptor gamma 3 subunit 


gamma-aminobutyric acid A receptor gamma 2 


gamma-aminobutyric acid A receptor, gamma 2 precursor
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(GABA(A) receptor) 


gamma-aminobutyric acid/benzodiazepine receptor gamma-2 chain precursor


GABA-A receptor gamma 2 subunit 


GABAa receptor gamma2 
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PRECURSOR (GABA(A) RECEPTOR) 
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receptor) [Homo sapiens] 


gamma-aminobutyric acid (GABA) A receptor, epsilon, isoform 1 precursor 
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G protein-coupled receptor 85; super conserved receptor expressed in brain 2 


GP85_HUMAN Probable G protein-coupled receptor GPR85 (Super conserved
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seven transmembrane helix receptor 


super conserved receptor expressed in brain 3 


SRB3_HUMAN Super conserved receptor expressed in brain 3


G-protein coupled receptor, SREB3 
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AAH09861 super conserved receptor expressed in brain 3 


carbonic anhydrase VB, mitochondrial precursor; carbonic dehydratase 
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carbonic anhydrase VB 


carbonic anhydrase VB, mitochondrial 
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mitochondrial; carbonic dehydratase 
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Thr (A65t) 
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A Chain A, Carbonic Anhydrase li Complexed With , 
4-(Aminosulfonyl)-N-[(2,3,4,5,6-Pentafluorophenyl)methyl]-Benzamide
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Carbonic Anhydrase li (E.C.4.2.1.1) (Native Zinc Replaced By Cobalt) Complex With 
Bicarbonate 


Carbonic Anhydrase II (E.C.4.2.1.1) With Zinc Replaced By Copper(II)


Carbonic Anhydrase II (E.C.4.2.1.1) Complex With Trifluoromethane Sulphonamide


Carbonic Anhydrase II (E.C.4.2.1.1) Complex With Bromide


Carbonic Anhydrase II (E.C.4.2.1.1) With Zinc Replaced By Cobalt(II)
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A Chain A, Huinan Carbonic Anhydrase li Complexed With Urea 


Human Carbonic Anhydrase II Complexed With The Histamine Activator


A Chain A, Site-Specific Mutant (Tyr7 Replaced With His) Of Human Carbonic
Anhydrase II
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A Chain A, Crystal Structure Of Human Carbonic Anhydrase li Complexed With An 
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A Chain A, Crystal Structure Analysis Of The Hca li Mutant T199p In Complex With 
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A Chain A, Crystal Structure Analysis Of Hca li Mutant T199p In Complex With 
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A Chain A, Crystal Structure Analysis Of Hca li Mutant T199p In Complex With 
Bicarbonate 


phenylethanolamine N-methyltransferase 


PNMT_HUMAN Phenylethanolamine N-methyltransferase (PNMTase)
(Noradrenaline N-methyltransferase)


phenylethanolamine N-methyltransferase (EC 2.1.1.28)


B Chain B, Crystal Structure Of Human Pnmt Complexed With Sk&f 29661 And 
Adohcy(Sah) 


A Chain A, Crystal Structure Of Human Pnmt Complexed With Sk&f 29661 And 
Adohcy(Sah) 
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AF498965J small GTP binding protein RAC2 


ras-related C3 botulinum toxin substrate 1 isoform Rac1; rho family, small GTP
binding protein Rac1


RAC1_HUMAN Ras-related C3 botulinum toxin substrate 1 (p21-Rac1) (Ras-like
protein TC25)


GTP-binding protein rac1


D Chain D, Crystal Structure Analysis Of Rac1-Gdp Complexed With Arfaptin (P21)


D Chain D, Crystal Structure Analysis Of Rac1-Gdp In Complex With Arfaptin (P41)


ras-related C3 botulinum toxin substrate


rac1 p21=small GTP-binding protein [human, HL60, Peptide, 192 aa]


Rac1 protein


AF498964_1 small GTP binding protein RAC1


AAH04247 ras-related C3 botulinum toxin substrate 1 (rho family, small GTP binding
protein Rac1)


small G protein 


ras-like protein 


D Chain D, Crystal Structure Analysis Of Rac1-Gmppnp In Complex With Arfaptin


A Chain A, Structure Of The RacP67PHOX COMPLEX 


A Chain A, Rac1-Rhogdi Complex Involved In Nadph Oxidase Activation


B Chain B, Rac1-Rhogdi Complex Involved In Nadph Oxidase Activation


ras-related C3 botulinum toxin substrate 3 (rho family, small GTP binding protein 
Rac3); rho family, small GTP binding protein Rac3 


RAC3_HUMAN Ras-related C3 botulinum toxin substrate 3 (p21-Rac3)
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AAH15197 ras-related C3 botulinum toxin substrate 3 (rho family, small GTP binding
protein Rac3)


AAH09605 ras-related C3 botulinum toxin substrate 3 (rho family, small GTP binding 
protein Rac3) 


AF498966_1 small GTP binding protein RAC3
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alphal-antichymotrypsin 


similar to Alpha-1-antichymotrypsin precursor (ACT)


AACT_HUMAN Alpha-1-antichymotrypsin precursor (ACT)


serine (or cysteine) proteinase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin),
member 3


Unknown (protein for MGC:18102)


serine (or cysteine) proteinase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin),
member 3


alpha-1-antichymotrypsin precursor


alpha-1-antichymotrypsin precursor


alpha-1-antichymotrypsin precursor


A Chain A, Alpha1-Antichymotrypsin Serpin In The Delta Conformation (Partial
Loop Insertion)


chymotrypsin inhibitor


alpha-1-antichymotrypsin, precursor; alpha-1-antichymotrypsin; antichymotrypsin


alpha-1-antichymotrypsin


A Chain A, Alpha1 Antichymotrypsin


AAH07920 Unknown (protein for MGC:14111)


L76133_1 lymphocyte antigen
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MHC class II antigen 


MHC class II histocompatibility antigen DR beta 1 chain precursor 


precursor 






CAA48671.1 


XP_028322.1


P01011


AAH03559.1 


AAH10530.1 


AAH34554.1 


AAD08810.1 


ITHUC 


AAA51560.1 


IQMN 


1313184C 


NP_001076.1


AAA51543.1 


2ACH 


AAH07920.1 


AAL40069.1 


AAH08403.1 


CAC08827.1 


154448 


AAA59713.1 


U:(C-D) 
2.65 




U:(C-D) 
2.59 




























U:(C-D) 
2.59 












Mm.10116




Mm.14191 




























Mm.22564












NM_018866
NP_061354.1




NM_008458
NP_032484.1




























NM_010382
NP_034512.1













2e-58 


0


0


0


0


0


0






2e-75 


2e-75 




8e-62 






1e-90


1e-90


1e-79




1150 


1150 


1141 


1950 


1950 


1949 




protein delta T3,glyco 


KIAA0710 gene product


KIAA0710 protein 


KIAA0710 gene product 


CUT2_HUMAN Homeobox protein Cux-2 (Cut-like 2)


The human homolog of mouse Cux-2 


similar to Homeobox protein Cux-2 (Cut-like 2) 


CUT1_HUMAN CCAAT displacement protein (CDP) (Cut-like 1)


CCAAT displacement protein, CDP [human, Peptide, 1505 aa] 


cut-like 1, CCAAT displacement protein; cut like 1, CCAAT displacement protein 
(Drosophila) 


alternatively spliced 


cut-like 1, CCAAT displacement protein (Drosophila) 


AF271236_1 transcription factor CUX2


hypothetical protein 


diacylglycerol O-acyltransferase homolog 2; GS1999full 


AAH15234 Unknown (protein for MGC:17861) 


AF384161_1 diacylglycerol acyltransferase 2 


product is unknown 


bA351K23.5 (novel protein) 


diacylglycerol O-acyltransferase 2 like 1; diacylglycerol acyltransferase 2-like 


AF384163_1 diacylglycerol acyltransferase 2-like protein


AC004876_5 similar to predicted proteins AAB54240 (PID:g2088822) and S67138
(PID:g2132925) 


1101394A 


NP_055686.1


BAA31685.1 


AAH24043.1 


O14529


BAA22962.2 


XP_027045.6 


P39880 


AAB26579.1 


NP_001904.1


AAA35654.1 


AAH25422.1 


AAG59620.1 


CAD38961.1 


NP_115953.2


AAH15234.1 


AAK84176.2


BAB40641.2 


CAD13492.1 


NP_477513.1


AAK84178.1 


AAD45832.1 




U:(C-D) 
2.27 






U:(C-D) 
2.26 


















U:(C-D) 
2.26 




















Mm.32580 


























Mm.180189




















AK004773 
XP_125911.2 






NM_007804 
NP_031830.1


















NM_026384
NP_080660.1



















1e-66


5e-57 


1e-55


5e-53 




0


0


0


0


0


0


5e-77 


5e-77 


5e-77 


5e-77 




2e-76 


2e-76 


2e-68 


7e-51 


1e-50


e-108


1081


similar to bA351K23.5 (novel protein) 


similar to bA351K23.5 (novel protein) 


similar to bA351K23.5 (novel protein) 


hypothetical protein FLJ22644 


unnamed protein product 


ezrin-binding protein PACE-1


hypothetical protein


dJ97P20.1 (novel gene)


ezrin-binding partner PACE-1


ezrin-binding partner PACE-1


Similar to hypothetical protein LOC57147 


similar to P-selectin glycoprotein ligand 1 precursor (PSGL-1) (Selectin P ligand) 
(CD162 antigen) 


SEPL_HUMAN P-selectin glycoprotein ligand 1 precursor (PSGL-1) (Selectin P
ligand) (CD162 antigen)


P-selectin glycoprotein ligand PSGL-1 precursor, long splice form 


P-selectin glycoprotein ligand 


selectin P ligand 


ligand for P-selectin 


selectin P ligand 


unnamed protein product 


apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3F; similar to 
Phorbolin 3 (APOBEC1-like)


apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3F 


KIAA1543 protein 


XP_088691.1


XP_088683.1


XP_093119.2


NP_079374.1


BAB15436.1


AAN41656.1 


CAB55300.1 


CAB52564.2 


AAN23123.1 


NP_065156.4


AAH14662.1 


XP_006867.4


Q14242 


A57468 


AAA74577.1 


NP_002997.1


AAC50061.1 


AAH29782.1 


BAC05283.1 


NP_660341.2 


AAH38808.1 


BAA96067.1 












U:(C-D) 
2.25 












U:(C-D) 
2.25 
















U:(C-D) 
2.24 




U:(C-D) 
2.23 












Mm.28152 












Mm.22173 
















Mm.89702




Mm.28248 












AK004809 
BAB23580.1 












NM_009151
NP_033177.1
















NM_030255
NP_084531.1 




AK009960 
XP_133997.2



e-108 


e-108 


1e-87


2e-62 


2e-62 


2e-62 


3e-62 




1e-59


1e-59


0


0


1e-73






e-136 


e-136 


e-136 


e-131 


e-131 




e-125 


e-120 


e-118 


e-118 


2e-73 




similar to KIAA1543 protein 


hypothetical protein 


AF289580_1 unknown


similar to KIAA1078 protein 


Unknown (protein for IMAGE:3870900)


KIAA1078 protein 




hypothetical protein 


Unknown (protein for IMAGE:3939659)


hypothetical protein 


hypothetical protein MGC15523


AAH14642 Similar to RIKEN cDNA 1810073N04 gene 


unnamed protein product 


KIAA1484 protein 


similar to hypothetical protein MGC7599; clone MGC:7599


similar to hypothetical protein MGC2656 


hypothetical protein FLJ30803 


unnamed protein product 


KIAA1246 protein 


similar to hypothetical protein MGC2656 


hypothetical protein MGC2656 


AAH03578 Unknown (protein for MGC:2656) 


Similar to KIAA1484 protein 


hypothetical protein MGC3103


similar to hypothetical protein MGC3103


AAH14678 Unknown (protein for IMAGE:3860672) 


XP_048362.1 


CAD38783.1 


AAL55764.1 


XP_036589.2 


AAH11385.1 


BAA83030.2 


T14744 


CAB53664.1 


AAH12778.1 


CAD39184.1 


NP_612637.1


AAH14642.1 


BAC04027.1 


BAA96008.1 


XP_046088.1


XP_085176.1


NP_689660.1


BAB70910.1 


BAA86560.1 


XP_166372.1


NP_078785.1 


AAH03578.1 


AAH25310.1 


NP_076941.2


AAH15581.2 


AAH14678.1 






















U:(C-D) 
2.23 






U:(C-D) 
2.21 














































Mm.3310 






Mm.183264














































NM_024249
NP_077211.2 






NM_030562
NP_085039.1




























0


0


0


0


0


0


0


0


0


0


0


0


0


0


0


0


0


0


0


e-113 


e-113 


e-113 


e-146 


1489 


1074


3',5'-cyclic-GMP phosphodiesterase (EC 3.1.4.35) alpha' chain 


ame cGMP phosphodiesterase 


cGMP phosphodiesterase 


CNRC_HUMAN Cone cGMP-specific 3',5'-cyclic phosphodiesterase alpha'-subunit


cone photoreceptor cGMP-phosphodiesterase alpha' subunit 


phosphodiesterase 6C, cGMP-specific, cone, alpha prime 


phosphodiesterase A' subunit 


phosphodiesterase 6B, cGMP-specific, rod, beta


CNRB_HUMAN Rod cGMP-specific 3',5'-cyclic phosphodiesterase beta-subunit
(GMP-PDE beta) 


3',5'-cyclic-GMP phosphodiesterase (EC 3.1.4.35) beta chain 


rod cGMP phosphodiesterase beta-subunit; PDEB 


3',5'-cyclic-nucleotide phosphodiesterase 


AAH00249 phosphodiesterase 6B, cGMP-specific, rod, beta (congenital stationary 
night blindness 3, autosomal dominant) 


cGMP phosphodiesterase beta subunit 


3',5'-cyclic-GMP phosphodiesterase (EC 3.1.4.35) alpha chain 


phosphodiesterase 6A, alpha subunit 


CNRA_HUMAN Rod cGMP-specific 3',5'-cyclic phosphodiesterase alpha-subunit
(GMP-PDE alpha) (PDE V-B1)




Rod cGMP phosphodiesterase 


phosphodiesterase 11A; cyclic nucleotide phosphodiesterase 11A1


phosphodiesterase 11A


phosphodiesterase 11A4 


aristaless-like homeobox 3 


JC4520 


CAA64079.1 


2207224A 


P51160 


AAA92886.1 


NP_006195.2


AAA96392.1 


NP_000274.1


P35913 


A42828 


AAB22690.1 


CAA46932.1 


AAH00249.1 


CAA44569.1 


B34611 


NP_000431.1


P16499


AAB69155.1 


CAA62215.1 


NP_058649.2


BAB16371.1 


BAB62712.1 


NP_006483.1


U:(C-D) 
2.15 












































U:(C-D) 
2.14 


Mm.196971












































Mm.10112


NM_033614
NP_291092.1












































NM_007441 
NP_031467.1






e-121 


e-121 


e-120 


e-120 


e-120 


t — } 








o 




2e-74 


2e-74 


2e-74 


2e-70 


2e-70 


2e-70 


0


0


0


0


5e-64 




9e-64 






1569 


1569 


1569 


1569 




4F2 light chain 


sodium-independent neutral amino acid transporter LAT1


solute carrier family 7 (cationic amino acid transporter, y+ system), member 6 


Similar to Schistosoma mansoni amino acid permease (L25068). 


solute carrier family 7 (cationic amino acid transporter, y+ system), member 6 


C. elegans protein Z37093 homolog [imported] 


similar to C. elegans protein (Z37093)




D1013901 


similar to PTPL1-associated RhoGAP 1


minor histocompatibility antigen HA-1


Similar to PTPL1-associated RhoGAP 1


PTPL1-associated RhoGAP 1


PTPL1-associated RhoGAP protein 1 [imported]


PTPL1-associated RhoGAP


Gem-interacting protein 


Gem-interacting protein [imported] 


AF132541_1 Gem-interacting protein


AF391100_1 alsin


KIAA1563 protein 


alsin 


long form 


hypothetical protein LOC259173 


unnamed protein product 


FLJ00189 protein


BAA75746.1 


BAB70708.1 


NP_003974.1


BAA13376.1 


AAH28216.1 


D59433 


BAA13212.1 


AAC03237.1 


XP_037574.1


AAN04658.1 


AAH35564.1 


NP_004806.1


E59430 


AAB81012.1 


NP_057657.1


D59435 


AAF61330.1 


AAL14103.1 


BAB13389.2 


NP_065970.1


BAB69014.1 


NP_667340.1 


BAC04237.1 


BAB84944.1 












U:(C-D) 
2.13 
























U:(C-D) 
2.12 
























Mm.5202


























AK018130 
BAB31085.1
























AK014320 
BAB29271.1 


















e-147 


e-147 


e-147 


e-147 


e-147 


e-143 


3e-66 


3e-66 




3e-66 


3e-66 


e-159 


6e-72 


1e-71


1e-71


5e-69 


7e-69 


7e-69 


7e-69 


7e-69 


7e-69 


7e-69 




forkhead box F2; forkhead (Drosophila)-like 6 


FXF2_HUMAN Forkhead box protein F2 (Forkhead-related protein FKHL6)
(Forkhead-related transcription factor 2) (FREAC-2) (Forkhead-related activator-2) 


forkhead protein FREAC-2 


forkhead protein FREAC-2 


forkhead transcription factor 


transcription factor FREAC-2 


forkhead box F1; forkhead (Drosophila)-like 5; Forkhead, drosophila, homolog-like 5;
forkhead-related activator 1 [Homo sapiens]


FXF1_HUMAN Forkhead box protein F1 (Forkhead-related protein FKHL5)
(Forkhead-related transcription factor 1) (FREAC-1) (Forkhead-related activator-1)


FREAC-1


forkhead transcription factor 


transcription factor FREAC-1 


similar to RIKEN cDNA 1200016G03




cytokeratin 


cytokeratin type II


cytokeratin type II 


keratin 5 (epidermolysis bullosa simplex, Dowling-Meara/Kobner/Weber-Cockayne 
types) 


keratin K5 


keratin 5; Keratin-5; 58 kda cytokeratin; keratin, type II cytoskeletal 5; cytokeratin 5 


K2C5_HUMAN Keratin, type II cytoskeletal 5 (Cytokeratin 5) (K5) (CK 5) (58 kDa 
cytokeratin) 


keratin 5, type II, epidermal 


keratin type II 


AF274874_1 keratin 5


NP_001443.1


Q12947 


T09474 


AAC32226.1 


AAD19875.1 


2208384B 


NP_001442.1 


Q12946 


AAC50399.1 


AAC61576.1 


2208384A 


XP_096612.2 


CAB76832.1 


NP_004684.1


CAA76730.1 


AAH24292.1 


AAA36145.1 


NP_000415.1


P13647


A29904 


AAA36143.1 


AAF97931.1 


U:(C-D) 
2.11 






















U:(C-D) 
2.1 






















Mm.6260 






















Mm.3338 
























NM_010225
NP_034355.1






















NM_028770


NP_083046.1 
























keratin-8





keratin 8; K 



NP_002264












































2e-97 


2e-97 


2e-97 


2e-96 


2e-96 


5e-53 


e-131 


e-131 


e-131 


e-131 


e-131 


e-102 


0


0


0


0


0


e-114 


e-114 




e-173 


e-173


1065 


1064 


1064 


1064 


(Thermogenin)


uncoupling protein 


AAH04436 Unknown (protein for MGC:3983)


AAH06500 Unknown (protein for MGC:2366)


peroxisomal long-chain acyl-coA thioesterase; peroxisomal long-chain acyl-coA 
thioesterase ; putative protein 


AAH06335 peroxisomal long-chain acyl-coA thioesterase 


unnamed protein product 


hypothetical protein FLJ31235


unnamed protein product 


ORF; putative 


similar to Peroxisomal acyl-coenzyme A thioester hydrolase 2 (Peroxisomal 
long-chain acyl-coA thioesterase 2) (ZAP128) 


bile acid Coenzyme A: amino acid N-acyltransferase; glycine N-choloyltransferase 


bile acid-CoA amino acid N-acyltransferase 


bile acid CoA: Amino acid N-acyltransferase 


AAH09567 bile acid Coenzyme A: amino acid N-acyltransferase (glycine 
N-choloyltransferase) 


Tax interaction protein 1 


Tax interaction protein 1 


AF234997_1 glutaminase-interacting protein 3 


AF277318_1 tax-interacting protein 1


Tax interaction protein 1 


CO4_HUMAN Complement C4 precursor [Contains: C4A anaphylatoxin]


complement C4A precursor [validated]


complement component C4A


complement component 4A preproprotein; acidic C4; Rodgers form of
C4; complement component 4S


complement component 4B preproprotein; Chido form of C4; basic C4; complement
component 4F


complement component C4 


complement component C4A 


complement C4B precursor 


complement component 3 precursor 


CO3_HUMAN Complement C3 precursor


complement C3 precursor [validated] 


complement component C3 


complement component C4B 


A Chain A, C4adg Fragment Of Human Complement Factor C4a 


similar to RIKEN cDNA 4930447D24


Unknown (protein for MGC:46609)


Similar to KIAA0940 protein


Shaw-related voltage-gated potassium channel protein 3; Kv3.3; voltage-gated
potassium channel protein KV3.3


KNC3_HUMAN Potassium voltage-gated channel subfamily C member 3 (Potassium
channel Kv3.3) (KSHIIID)


Shaw-related voltage-gated potassium channel protein 1; voltage-gated potassium


Shaw-related voltage-gated potassium channel protein 4 isoform a; voltage-gated


AF268896_1 voltage gated potassium channel Kv3.2b


potassium voltage-gated potassium channel subfamily C member 2 


Shaw-related voltage-gated potassium channel protein 2 isoform KV3.2a 


AF268897_1 voltage gated potassium channel Kv3.2a


Z148_HUMAN Zinc finger protein 148 (Zinc finger DNA binding protein 89) 
(Transcription factor ZBP-89) 


zinc finger DNA binding protein 89 kDa 


AF432210_1 CLL-associated antigen KW-10


zinc finger protein 148 (pHZ-52); zinc finger protein 148 (pHZ-52), BERF-1, ZBP-89 


ZBP-89 protein 


CACCC box-binding protein ht-beta 


CACCC box-binding protein 


Similar to zinc finger protein 148 (pHZ-52) 


zinc finger binding protein homolog 


zinc finger protein 


zinc finger protein 281; ZNP-99 transcription factor 


AAH12287 Similar to lipase A, lysosomal acid, cholesterol esterase (Wolman disease)










LICH_HUMAN Lysosomal acid lipase/cholesteryl ester hydrolase precursor (LAL)
(Acid cholesteryl ester hydrolase) (Sterol esterase) (Lipase A) (Cholesteryl esterase) 














LIPG_HUMAN Triacylglycerol lipase, gastric precursor (Gastric lipase) (GL) 


triacylglycerol lipase (EC 3.1.1.3) precursor, gastric 


gastric lipase precursor 


gastric lipase precursor 


A Chain A, Crystal Structure Of Human Gastric Lipase 


B Chain B, Crystal Structure Of Human Gastric Lipase 


Citation of documents herein is not intended as an
admission that any of the documents cited herein is
pertinent prior art, or an admission that the cited
documents are considered material to the patentability of any
of the claims of the present application. All statements as
to the date or representation as to the contents of these
documents are based on the information available to the
applicant and do not constitute any admission as to the
correctness of the dates or contents of these documents.

The appended claims are to be treated as a non-limiting
recitation of preferred embodiments.

In addition to those set forth elsewhere, the following
references are hereby incorporated by reference, in their
most recent editions as of the time of filing of this
application: Kay, Phage Display of Peptides and Proteins: A
Laboratory Manual; the John Wiley and Sons Current Protocols
series, including Ausubel, Current Protocols in Molecular
Biology; Coligan, Current Protocols in Protein Science;
Coligan, Current Protocols in Immunology; Current Protocols
in Human Genetics; Current Protocols in Cytometry; Current
Protocols in Pharmacology; Current Protocols in
Neuroscience; Current Protocols in Cell Biology; Current
Protocols in Toxicology; Current Protocols in Field
Analytical Chemistry; Current Protocols in Nucleic Acid
Chemistry; and Current Protocols in Human Genetics; and
the following Cold Spring Harbor Laboratory publications:
Sambrook, Molecular Cloning: A Laboratory Manual; Harlow,
Antibodies: A Laboratory Manual; Manipulating the Mouse
Embryo: A Laboratory Manual; Methods in Yeast Genetics: A
Cold Spring Harbor Laboratory Course Manual; Drosophila
Protocols; Imaging Neurons: A Laboratory Manual; Early
Development of Xenopus laevis: A Laboratory Manual; Using
Antibodies: A Laboratory Manual; At the Bench: A Laboratory
Navigator; Cells: A Laboratory Manual; Methods in Yeast
Genetics: A Laboratory Course Manual; Discovering Neurons:
The Experimental Basis of Neuroscience; Genome Analysis: A
Laboratory Manual Series; Laboratory DNA Science;
Strategies for Protein Purification and Characterization: A
Laboratory Course Manual; Genetic Analysis of Pathogenic
Bacteria: A Laboratory Manual; PCR Primer: A Laboratory
Manual; Methods in Plant Molecular Biology: A Laboratory
Course Manual; Manipulating the Mouse Embryo: A Laboratory
Manual; Molecular Probes of the Nervous System; Experiments
with Fission Yeast: A Laboratory Course Manual; A Short
Course in Bacterial Genetics: A Laboratory Manual and
Handbook for Escherichia coli and Related Bacteria; DNA
Science: A First Course in Recombinant DNA Technology;
Methods in Yeast Genetics: A Laboratory Course Manual;
Molecular Biology of Plants: A Laboratory Course Manual.

All references cited herein, including journal articles
or abstracts, published, corresponding, prior or otherwise
related U.S. or foreign patent applications, issued U.S. or
foreign patents, or any other references, are entirely
incorporated by reference herein, including all data,
tables, figures, and text presented in the cited references.
Additionally, the entire contents of the references cited
within the references cited herein are also entirely
incorporated by reference.

Reference to known method steps, conventional method
steps, known methods or conventional methods is not in any
way an admission that any aspect, description or embodiment
of the present invention is disclosed, taught or suggested
in the relevant art.

The foregoing description of the specific embodiments
will so fully reveal the general nature of the invention
that others can, by applying knowledge within the skill of
the art (including the contents of the references cited
herein), readily modify and/or adapt for various
applications such specific embodiments, without undue
experimentation, without departing from the general concept
of the present invention. Therefore, such adaptations and
modifications are intended to be within the meaning and
range of equivalents of the disclosed embodiments, based on
the teaching and guidance presented herein. It is to be
understood that the phraseology or terminology herein is for
the purpose of description and not of limitation, such that
the terminology or phraseology of the present specification
is to be interpreted by the skilled artisan in light of the
teachings and guidance presented herein, in combination with
the knowledge of one of ordinary skill in the art.

Any description of a class or range as being useful or
preferred in the practice of the invention shall be deemed a
description of any subclass (e.g., a disclosed class with
one or more disclosed members omitted) or subrange contained
therein, as well as a separate description of each
individual member or value in said class or range.

The description of preferred embodiments individually
shall be deemed a description of any possible combination of
such preferred embodiments, except for combinations which
are impossible (e.g., mutually exclusive choices for an
element of the invention) or which are expressly excluded by
this specification.

If an embodiment of this invention is disclosed in the
prior art, the description of the invention shall be deemed
to include the invention as herein disclosed with such
embodiment excised.

CLAIMS

1. A method of protecting a human subject from
progression from a normoinsulinemic state to a
hyperinsulinemic state, or from either to a type II diabetic
state, which comprises administering to the subject a
protective amount of an agent which is

(1) a polypeptide which is substantially structurally
identical or conservatively identical in sequence to a
reference protein which is selected from the group
consisting of mouse and human proteins set forth in master
table 1, subtables 1A and 1C,

or

(2) an expression vector encoding the polypeptide of (1)
above and expressible in a human cell, under conditions
conducive to expression of the polypeptide of (1);

where said agent protects said subject from progression from
a normoinsulinemic state to a hyperinsulinemic state, or
from either to a type II diabetic state.

2. A method of protecting a human subject from progression
from a normoinsulinemic state to a hyperinsulinemic state,
or from either to a type II diabetic state, which comprises
administering to the subject a protective amount of an agent
which is

(1) an antagonist of a polypeptide, occurring in said
subject, which is substantially structurally identical or
conservatively identical in sequence to a reference protein
which is selected from the group consisting of mouse and
human proteins set forth in master table 1, subtables 1B and
1C, or

(2) an anti-sense vector which inhibits expression of said
polypeptide in said subject,

where said agent protects said subject from progression from
a normoinsulinemic state to a hyperinsulinemic state, or
from either to a type II diabetic state.

3. A method of screening for human subjects who are 
prone to progression from a normoinsulinemic state to a 
hyperinsulinemic state, or from either to a type II diabetic 
state, which comprises assaying tissue or body fluid samples 
from said subjects to determine the level of expression of a 
"favorable" human marker gene, said human marker gene 
encoding a human protein which is substantially structurally 
identical or conservatively identical in sequence to a 
reference protein which is selected from the group 
consisting of mouse and human proteins set forth in master 
table 1, subtables 1A and 1C, 

and directly correlating the level of expression of said 
marker gene with the propensity to progression in said 
patient. 

4. A method of screening for human subjects who have a 
propensity for progression from a normoinsulinemic state to 
a hyperinsulinemic state, or from either to a type II 
diabetic state, which comprises assaying tissue or body 
fluid samples from said subjects to determine the level of 
expression of an "unfavorable" human marker gene, said 
human marker gene encoding a human protein which is 
substantially structurally identical or conservatively 
identical in sequence to a reference protein which is 
selected from the group consisting of mouse and human 
proteins set forth in master table 1, subtables 1B and 1C, 
and inversely correlating the level of expression of said 
marker gene with the propensity to progression in said 
patient. 

5. The method of claims 1 or 3 in which the reference 
protein is of subtable 1A. 

6. The method of claims 1 or 3 in which the reference 
protein is of subtable 1B. 

7. The method of claim 3 or 4 in which the sample is a 
muscle tissue sample. 

8. The method of any one of claims 1-7 in which the 
reference protein is a human protein. 

9. The method of any one of claims 1-7 in which the 
reference protein is a mouse protein. 

10. The method of any one of claims 3 or 4 in which the 
level of expression of the marker protein is ascertained by 
measuring the level of the corresponding messenger RNA. 

11. The method of any one of claims 3 or 4 in which the 
level of expression is ascertained by measuring the level of 
a protein encoded by said marker gene. 

12. The method of any one of claims 1-9 in which said 
polypeptide is at least 80% identical or at least highly 
conservatively identical to said reference protein. 

13. The method of any one of claims 1-10 in which said 
polypeptide is at least 90% identical to said reference 
protein. 

14. The method of any one of claims 1-11 in which said 
polypeptide is identical to said reference protein. 

15. The method of any one of claims 1-14 in which the 
E-value cited for the reference protein in Master Table 1 is 
not more than e-6. 

16. The method of claim 15 in which the E-value cited for 
the reference protein in Master Table 1 is less than e-10. 



17. The method of claim 17 in which the E value calculated 
by BLASTN or BLASTX would be less than e-15, more preferably 
less than e-20, still more preferably less than e-40, even 
more preferably less than e-60, considerably more preferably 
less than e-80, and most preferably less than e-100. 

18. The method of any of claims 2-17 in which the antagonist 
is an antibody, or an antigen-specific binding fragment of 
an antibody. 

19. The method of any of claims 2-17 in which the antagonist 
is a peptide, peptoid, nucleic acid, or peptide nucleic acid 
oligomer. 

20. The method of any of claims 2-17 in which the antagonist 
is an organic molecule with a molecular weight of less than 
500 daltons. 

21. The method of claim 20 in which said organic molecule is 
identifiable as a molecule which binds said polypeptide by 
screening a combinatorial library. 

22. The method of claim 1 or 2 in which the agent is 
delivered systemically. 

23. The method of claim 1 or 2 in which the agent is 
selectively delivered to muscle tissue. 

ABSTRACT OF THE DISCLOSURE 



Mouse genes differentially expressed in comparisons of 
normal vs. hyperinsulinemic, hyperinsulinemic vs. type 2 
diabetic, and normal vs. type 2 diabetic muscle by gene chip 
analysis have been identified, as have corresponding human 
genes and proteins. The human molecules, or antagonists 
thereof, may be used for protection against hyperinsulinemia 
or type 2 diabetes, or their sequelae. 